Parquet files as Delta Table read by Azure Databricks from Azure Storage on public network or Microsoft network? - Stack Overflow


We created a Delta Table in Azure Databricks, with the Parquet files stored in Azure Storage. Is this data read happening over the public network or the Microsoft network? As of now, neither Azure Storage nor Azure Databricks is in any of our VNets. Will adding both of them improve read speed? Will creating a Private Endpoint on Azure Storage ensure reads go through the Microsoft network?


asked Mar 23 at 11:40 by knowdotnet
  • Can you please share the approach you have tried so far? – Dileep Raj Narayan Thumula, Mar 24 at 3:25

1 Answer


If your Azure Storage and Azure Databricks are not in any VNet, the data read is happening over the public network. To make sure that the data read happens over the Microsoft network, you can use Azure Private Link to create private endpoints for both Azure Storage and Azure Databricks.

Creating private endpoints will make sure that the traffic between Azure Databricks and Azure Storage remains within the Microsoft network, which can improve security and potentially improve read speed by avoiding the public internet.
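From the client's point of view, nothing changes: the routing (public internet vs. Microsoft backbone) is decided by the network configuration, not by the read code. A minimal sketch, assuming placeholder storage-account, container, and path names:

```python
# Sketch: reading a Delta table from ADLS Gen2 in a Databricks notebook.
# "mystorageacct", "mycontainer", and "delta/sales" are placeholder names.

def abfss_uri(container: str, account: str, path: str) -> str:
    """Build the ADLS Gen2 URI that Databricks uses to address the data."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"

uri = abfss_uri("mycontainer", "mystorageacct", "delta/sales")
print(uri)  # abfss://mycontainer@mystorageacct.dfs.core.windows.net/delta/sales

# Inside a Databricks notebook, where `spark` is predefined:
# df = spark.read.format("delta").load(uri)
# display(df)
```

Whether this read traverses the public internet or stays on the Microsoft network depends entirely on whether a private endpoint (or service endpoint) is configured for the storage account.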

ADLS Gen2 operates on a shared architecture. To securely access it from Azure Databricks, there are two available options:

  1. Service Endpoints
  2. Azure Private Link

You can choose either of the above approaches. Note that securing access between Azure Databricks (ADB) and ADLS Gen2 requires the ADB workspace to be VNet-injected, regardless of the approach used.

When a storage account is configured with a private endpoint, its firewall is enabled by default. To allow access, the VNet and subnets used by Databricks must be added to the firewall rules (under the storage account's Networking settings).
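The firewall rule amounts to a deny-by-default network-rule set with an allow entry for the Databricks subnet. A sketch of that rule set as a plain dictionary, with placeholder resource IDs (this mirrors what the portal configures; with the azure-mgmt-storage SDK you would pass a structure like this to a storage-account update call):

```python
# Sketch: storage-account firewall (network rule set) allowing the
# Databricks subnet. The subscription, resource group, VNet, and
# subnet names are placeholders.

subnet_id = (
    "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/"
    "Microsoft.Network/virtualNetworks/databricks-vnet/subnets/public-subnet"
)

network_rules = {
    "default_action": "Deny",      # block all traffic by default
    "bypass": "AzureServices",     # still allow trusted Azure services
    "virtual_network_rules": [
        {"virtual_network_resource_id": subnet_id, "action": "Allow"}
    ],
}
```

A VNet-injected Databricks workspace uses two subnets (often called public and private), so in practice both typically need an allow entry.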

After this, you can mount the ADLS Gen2 container. However, to read files from a folder, you also need to manage ACLs for both the container and the files.
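The mount step can be sketched as follows, using the standard OAuth service-principal configuration keys. All IDs are placeholders, and in practice the client secret should come from a Databricks secret scope (`dbutils.secrets.get`) rather than being written inline; `dbutils` exists only inside a Databricks notebook, so the call is guarded here:

```python
# Sketch: mounting an ADLS Gen2 container with a service principal.
# <application-id>, <client-secret>, and <tenant-id> are placeholders.

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": "<client-secret>",
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

source = "abfss://mycontainer@mystorageacct.dfs.core.windows.net/"

try:
    dbutils  # defined only inside Databricks notebooks
except NameError:
    dbutils = None

if dbutils:
    dbutils.fs.mount(
        source=source,
        mount_point="/mnt/delta",
        extra_configs=configs,
    )
```

Even with a valid mount, reads fail with permission errors unless the service principal has the required ACLs (or an RBAC role such as Storage Blob Data Reader) on the container and files.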

ACLs can be set the same way for individual files: right-click the file that needs to be accessed from the Databricks notebook and manage its access.

Read more: Secure Access to Storage: Azure Databricks and Azure Data Lake Storage Gen2 Patterns

Deploy Azure Databricks in your Azure virtual network (VNet injection)
