I am trying to run a complex query in PySpark over a JDBC connection, but I am running into an issue with a query that contains a CTE.
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("TestApp").getOrCreate()
jdbc_url=sourceconnectiondetails['url']
prepare_query = f"""WITH CombinedCompanies AS ( Select * from FinancialsSpark..financialperiodV2_tbl temp WITH(Nolock))"""
query = f""" Select * from CombinedCompanies"""
df = (spark.read
.format("jdbc")
.option("url", sourceconnectiondetails['url'])
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
.option("prepare_query", prepare_query)
.option("query", query)
.option("user", sourceconnectiondetails['user'])
.option("password", sourceconnectiondetails['password'])
.load())
df.show()
The real CTE is more complicated than this example, and CTEs are the only way to run the whole query. I tried putting the CTE in prepareQuery and the SELECT statement in query, following this source on how to use prepareQuery.
The error is com.microsoft.sqlserver.jdbc.SQLServerException: Invalid object name 'CombinedCompanies'
Can anyone help with how to use CTEs in PySpark?
There is a similar issue here, but the accepted answer is not working for me.
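For reference, here is a minimal sketch of how the two options are usually paired. It assumes Spark 3.4 or later, where the JDBC reader documents a `prepareQuery` option that is prefixed to `query`. The snippet above passes the key as `prepare_query`, which Spark does not treat as `prepareQuery`, so only the SELECT would reach SQL Server, and that would be consistent with the "Invalid object name" error. The helper names `build_jdbc_options` and `read_with_cte` are hypothetical and only used for illustration.

```python
def build_jdbc_options(conn, cte_prefix, select_query):
    """Build JDBC reader options that pair a CTE prefix with the final SELECT.

    `conn` is assumed to be a dict with 'url', 'user' and 'password' keys,
    like `sourceconnectiondetails` in the question.
    """
    return {
        "url": conn["url"],
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        # Camel-case key: this is the option name documented for Spark 3.4+.
        "prepareQuery": cte_prefix,
        "query": select_query,
        "user": conn["user"],
        "password": conn["password"],
    }


def read_with_cte(spark, conn, cte_prefix, select_query):
    """Load a DataFrame over JDBC using the options built above (hypothetical helper)."""
    options = build_jdbc_options(conn, cte_prefix, select_query)
    return spark.read.format("jdbc").options(**options).load()


# Plain strings are enough here; there are no placeholders to format.
prepare_query = (
    "WITH CombinedCompanies AS ("
    "SELECT * FROM FinancialsSpark..financialperiodV2_tbl temp WITH (NOLOCK))"
)
query = "SELECT * FROM CombinedCompanies"

# Usage, given an existing SparkSession and connection details:
# df = read_with_cte(spark, sourceconnectiondetails, prepare_query, query)
# df.show()
```

If the cluster runs a Spark version older than 3.4, this option is probably not available, so it is worth checking `spark.version` first.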