I am trying out the extension to compute weakly connected components. I am using DuckDB 1.1.3, and it works on my laptops (tested on an Intel Mac and an M4 Mac). On my server machine, however, I get a segfault, and unfortunately no further information is printed.
import duckdb
import pandas as pd
import time
# Note: the timer is started just before the WCC query (see below)
# Read the TSV file
print("Loading the data")
input_file = "links.tsv" # Replace with the actual file path
data = pd.read_csv(input_file, sep="\t")
# Ensure the TSV has the correct columns
if "a" not in data.columns or "b" not in data.columns:
    raise ValueError("The input TSV must contain 'a' and 'b' columns")
# Connect to an in-memory DuckDB instance
db = duckdb.connect(":memory:")
print(duckdb.__version__)
# Load the DataFrame into DuckDB as a table
db.register("data_table", data)
print("Installing extension")
db.execute("INSTALL duckpgq FROM community")
db.execute("LOAD duckpgq")
# Create the vertex table in DuckDB
db.execute(
    "CREATE TABLE devices AS SELECT a AS device FROM data_table UNION SELECT b AS device FROM data_table"
)
# Create an edges table in DuckDB
db.execute("CREATE TABLE edges AS SELECT * FROM data_table")
# Create a property graph from the edges
query_create_graph = """
CREATE PROPERTY GRAPH graph
VERTEX TABLES (
    devices
)
EDGE TABLES (
    edges
        SOURCE KEY (a) REFERENCES devices(device)
        DESTINATION KEY (b) REFERENCES devices(device)
        LABEL connects
);
"""
db.execute(query_create_graph)
print("Computing WCC")
start_time = time.time()
# Compute the weakly connected components
query_wcc = """
FROM weakly_connected_component(graph, devices, connects);
"""
wcc_result = db.execute(query_wcc).fetchdf()
# Print or save the results
# print(wcc_result)
output_file = "weakly_connected_components.csv"
wcc_result.to_csv(output_file, index=False)
print(f"Weakly connected components saved to {output_file}")
# Drop the property graph
db.execute("DROP PROPERTY GRAPH graph")
# Close the connection
db.close()
# End the timer and print the runtime
end_time = time.time()
runtime = end_time - start_time
print(f"Script runtime: {runtime:.2f} seconds")
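On a machine where the extension crashes, the grouping can still be cross-checked without DuckDB. The following is a minimal union-find sketch, not the extension's algorithm: `weakly_connected_components` is a hypothetical helper, and it labels each component by an arbitrary representative vertex, so only the grouping (not the component IDs) is comparable with the extension's output.

```python
def weakly_connected_components(edges):
    """Map each vertex to a component representative, ignoring edge direction.

    `edges` is any iterable of (a, b) pairs, e.g. the rows of links.tsv.
    """
    parent = {}

    def find(x):
        # Register unseen vertices as their own root.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every vertex on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    return {vertex: find(vertex) for vertex in list(parent)}


components = weakly_connected_components([("a", "b"), ("b", "c"), ("d", "e")])
```

With the DataFrame from the script, the edges could be passed as `zip(data["a"], data["b"])`.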
OS:
NixOS 24.05 (Uakari)
DuckDB Version:
1.1.3
DuckDB Client:
Python
Full Name:
Anak Bagus
Affiliation:
Arista Networks
How did you load the extension?
Community extension version
Did you include all relevant data sets for reproducing the issue?
No - I cannot share the data sets because they are confidential
Did you include all code required to reproduce the issue?
Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?
Yes, I have
Hi @prdx, thank you for the bug report. Unfortunately, I can't reproduce the bug with the information given. Could you reproduce this segmentation fault using (anonymized) data? Otherwise, would it be possible to mail the file (or anonymized data) to me ([email protected]). I would like to solve this bug :)
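One way to produce shareable data is to replace every identifier with an integer while keeping the graph structure intact. This is only a sketch under assumptions: `anonymize_edges` and `anonymize_tsv` are hypothetical helpers, the input is assumed to be a TSV with `a` and `b` columns as in the script, and if the crash depends on the original value types or string lengths, integer IDs may no longer trigger it.

```python
import csv


def anonymize_edges(rows):
    """Replace each distinct identifier with an integer, in first-seen order."""
    ids = {}
    anonymized = []
    for a, b in rows:
        # setdefault assigns the next free integer only to unseen identifiers.
        new_a = ids.setdefault(a, len(ids))
        new_b = ids.setdefault(b, len(ids))
        anonymized.append((new_a, new_b))
    return anonymized


def anonymize_tsv(src, dst):
    """Read the 'a' and 'b' columns from src and write anonymized edges to dst."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.DictReader(fin, delimiter="\t")
        writer = csv.writer(fout, delimiter="\t")
        writer.writerow(["a", "b"])
        writer.writerows(anonymize_edges((row["a"], row["b"]) for row in reader))


sample = anonymize_edges([("x", "y"), ("y", "z"), ("x", "z")])
```

It is worth confirming that the anonymized file still reproduces the segfault on the server before sending it.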
What happens?
Hi everyone, thanks for the work! Very promising!
I am attaching the stack trace here:
Stacktrace: https://gist.github.com/prdx/ce6d850697e00e7efd957b53e2c3217e
To Reproduce
I am using the Python script shown above.