I'll create a demo DataFrame to reproduce the error that I see in Databricks.
from pyspark.sql.types import StructType, StructField, TimestampType, StringType
from datetime import datetime
# Define the schema
schema = StructType([
    StructField("session_ts", TimestampType(), True),
    StructField("analysis_ts", TimestampType(), True)
])
# Define the data with datetime objects
data = [
    (datetime(2023, 9, 15, 17, 30, 41), datetime(2023, 9, 15, 17, 47, 3)),
    (datetime(2023, 10, 24, 18, 23, 37), datetime(2023, 10, 24, 18, 25, 16)),
    (datetime(2024, 1, 15, 6, 38, 52), datetime(2024, 1, 15, 6, 48, 15)),
    (datetime(2024, 2, 21, 13, 16, 37), datetime(2024, 2, 21, 13, 22, 35)),
    (datetime(2023, 10, 18, 17, 52, 28), datetime(2023, 10, 19, 17, 11, 3))
]
# Create a DataFrame
df = spark.createDataFrame(data, schema=schema)
When I try to convert the PySpark DataFrame to pandas, I get the error: TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.
df.toPandas().head()
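For context, the TypeError itself comes from pandas, not Spark: pandas 2.0 stopped allowing casts to the unit-less 'datetime64' dtype and now requires an explicit unit such as 'datetime64[ns]'. A minimal pandas-only sketch that reproduces the same message (assuming pandas >= 2.0):

import pandas as pd
from datetime import datetime

s = pd.Series([datetime(2023, 9, 15, 17, 30, 41)], dtype="object")
# s.astype("datetime64")  # raises the same TypeError on pandas >= 2.0
s.astype("datetime64[ns]")  # works: the unit is explicit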
Casting the fields to TimestampType did not resolve the error.
df = df.withColumn("session_ts", df["session_ts"].cast(TimestampType()))
df = df.withColumn("analysis_ts", df["analysis_ts"].cast(TimestampType()))
df.toPandas()
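That cast is effectively a no-op: the schema already declares both columns as TimestampType, which df.printSchema() confirms, so it's no surprise the error is unchanged.

df.printSchema()
# root
#  |-- session_ts: timestamp (nullable = true)
#  |-- analysis_ts: timestamp (nullable = true)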
I was only able to proceed by casting to string, which seems an unnecessary workaround.
df = df.withColumn("session_ts", df["session_ts"].cast(StringType()))
df = df.withColumn("analysis_ts", df["analysis_ts"].cast(StringType()))
df.toPandas()
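If you do fall back on the string cast, the columns can at least be parsed back into real timestamps on the pandas side; a sketch of that round trip using the standard pd.to_datetime:

import pandas as pd

pdf = df.toPandas()
# Parse the stringified columns back into datetime64[ns]
pdf["session_ts"] = pd.to_datetime(pdf["session_ts"])
pdf["analysis_ts"] = pd.to_datetime(pdf["analysis_ts"])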
1 Answer
1) Ensure datetime64[ns] During Conversion
import pyspark.sql.functions as F
# Explicitly cast timestamps to ensure compatibility
df = df.withColumn("session_ts", F.col("session_ts").cast("timestamp"))
df = df.withColumn("analysis_ts", F.col("analysis_ts").cast("timestamp"))

# Convert to pandas
pdf = df.toPandas()
print(pdf.head())
2) Disable PyArrow for Conversion (Fallback to Legacy Conversion)
# Disable PyArrow during the conversion
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false")

# Convert to pandas
pdf = df.toPandas()
print(pdf.head())
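Whichever option you use, it's worth confirming that the columns actually arrive as datetime64[ns] on the pandas side rather than as objects; a quick check:

print(pdf.dtypes)
# Expect something like:
# session_ts     datetime64[ns]
# analysis_ts    datetime64[ns]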