I traced a segfault that occurs when parsing large raw CSVs to the provided snippet. One of the values (here under "header2") is a 16-character hexadecimal string, and under some conditions, it cannot be ...
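To narrow down which rows are involved before handing the file to the real parser, something like the following might help. It is only a triage sketch using the standard-library csv module, which keeps every field as a string. The helper name `find_hex16_rows` and the sample data are made up for illustration; only the "header2" column name comes from the report.

```python
import csv
import io
import re

# Matches exactly 16 hexadecimal characters (upper or lower case).
HEX16 = re.compile(r"[0-9a-fA-F]{16}")


def find_hex16_rows(text_stream, column="header2"):
    """Return (line_number, value) pairs whose `column` is a 16-char hex string.

    Hypothetical helper: it does not reproduce the segfault, it only lists
    the values worth testing one by one.
    """
    reader = csv.DictReader(text_stream)
    hits = []
    for row in reader:
        value = row.get(column) or ""
        if HEX16.fullmatch(value):
            # line_num is the physical line just read from the stream
            hits.append((reader.line_num, value))
    return hits


# Made-up sample; replace with open("your_file.csv", newline="") in practice.
sample = io.StringIO(
    "header1,header2\n"
    "a,00000000000e1234\n"
    "b,not-hex\n"
    "c,DEADBEEFDEADBEEF\n"
)
hits = find_hex16_rows(sample)
```

Feeding the reported values back into the failing snippet one at a time should show whether a single value is enough to trigger the crash.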
Calling df.drop_duplicates on a pandas DataFrame that uses the pyarrow backend incorrectly removes unique columns and keeps non-unique columns. This only occurs with pyarrow version 21.0.0, ...