To store graph insights from banking fraud detection (covering IDT, SID, ATO) for downstream investigation and mitigation, design a table schema that captures algorithm output, network features, entity context, and traceability links. The schema should support storage of graph metrics, community and cluster data, and allow for extensibility (such as JSON fields) to accommodate evolving analytics.
| Column Name | Data Type | Description |
|---|---|---|
| insight_id | UUID / PK | Unique identifier for each insight |
| entity_id | VARCHAR | Primary node/entity reference (user/account/device/etc.) |
| entity_type | VARCHAR | Type of entity (user, account, device, etc.) |
| insight_type | VARCHAR | Algorithm/category: e.g., 'IDT', 'SID', 'ATO', 'centrality', 'path', 'community', 'anomaly' |
| algorithm_used | VARCHAR | Algorithm name: e.g., 'pagerank', 'shortest_path', 'louvain', 'similarity' |
| score | FLOAT | Numeric metric: centrality score, path score, similarity, cluster density, etc. |
| related_entities | JSON | List of referenced node/entity IDs involved in the detected structure or algorithm result |
| cluster_id | VARCHAR | Community or cluster label for the entity (if relevant, else NULL) |
| path_details | JSON | Traversal path data: ordered entity IDs, hops, and relationships (optional) |
| features | JSON | Computed features: e.g., {in_degree, out_degree, num_neighbors, avg_path_length} |
| insight_timestamp | TIMESTAMP | When the algorithm detected/stored this insight |
| raw_algorithm_output | JSON | Raw/full output dump of algorithm if complex or needed for audit (optional, for traceability) |
| investigation_status | VARCHAR | Enum: 'open', 'reviewed', 'escalated', 'resolved' |
| notes | TEXT | Analyst comments or contextual notes on the insight |
- related_entities and path_details fields allow efficient representation of non-tabular graph structures for path finding, ring detection, or fraud rings as input for further queries or visualization.12
- features field is extensible for per-entity graph features (centrality, anomaly score, local subgraph stats, etc.) and allows downstream teams to adapt schema for new metrics.34
- raw_algorithm_output captures full logs or intermediate results if future transparency, explainability, or deep-dive investigation is necessary.4
- entity_type and insight_type fields let investigators quickly filter and pivot between types of entities (accounts, devices, IPs) and insight categories (fraud ring, outlier, high-risk link, etc.).5
- cluster_id and score fields facilitate clustering/community analysis, reflecting algorithm output for fraud rings and collusive networks.34
- Use JSON fields generously for algorithm details, path data, and computed features to ensure schema flexibility and future-proofing, while keeping core indexes and keys as first-class columns for query/index performance and reporting.64
{
"insight_id": "d2f7e86c-259c-4f90-905c-1fae9db1df47",
"entity_id": "acct_12345678",
"entity_type": "account",
"insight_type": "ATO_centrality",
"algorithm_used": "pagerank",
"score": 0.864,
"related_entities": ["user_221", "device_54a", "acct_8765123"],
"cluster_id": "community_45",
"path_details": {
"path": ["acct_12345678", "device_54a", "acct_8765123"],
"relations": ["accessed_by", "accessed_by"]
},
"features": {
"in_degree": 5,
"out_degree": 2,
"local_cluster_density": 0.78
},
"insight_timestamp": "2025-09-05T10:05:11Z",
"raw_algorithm_output": {"scores": {"acct_12345678": 0.864, ...}},
"investigation_status": "open",
"notes": "Flagged for further review: unusually high centrality and connection to known fraud ring."
}This schema, mixing indexed and JSON fields, gives downstream teams structured, detailed, and context-rich graph insights for investigation and mitigation.643 7891011121314151617181920
Footnotes
-
https://go.neo4j.com/rs/710-RRC-335/images/Neo4j_WP-Fraud-Detection-with-Graph-Databases.pdf ↩
-
https://www.nebula-graph.io/posts/fraud-detection-with-graph-analytics ↩
-
https://www.tigergraph.com/blog/uncovering-financial-fraud-how-graph-technology-transforms-detection-in-banking/ ↩ ↩2 ↩3
-
https://go.neo4j.com/rs/710-RRC-335/images/Neo4j-Financial-Fraud-Detection-GDS-white-paper-EN-A4.pdf ↩ ↩2 ↩3 ↩4 ↩5
-
https://www.tigergraph.com/glossary/fraud-detection-with-graph/ ↩
-
https://www.tigergraph.com.cn/wp-content/uploads/2021/07/TigerGraph-Fraud-Solution-EN-Version.pdf ↩
-
https://neo4j.com/blog/developer/exploring-fraud-detection-neo4j-graph-data-science-part-1/ ↩
-
https://aws.amazon.com/blogs/database/use-graph-machine-learning-to-detect-fraud-with-amazon-neptune-analytics-and-graphstorm/ ↩
-
https://www.sciencedirect.com/science/article/pii/S0167923620300580 ↩
-
https://www.dataversity.net/the-power-of-graph-databases-to-detect-fraud/ ↩
-
https://www.formica.ai/blog/detecting-anomalies-with-fraud-graphs ↩
-
https://www.minfytech.com/blogs/graph-neural-networks-for-financial-fraud-detection ↩
-
https://blogs.oracle.com/database/post/graph-database-use-cases-for-financial-services-companies ↩
-
https://resources.nvidia.com/en-us-financial-services-industry/optimizing-fraud-det ↩
-
https://www.graphable.ai/blog/graph-database-fraud-detection/ ↩
-
https://cacm.acm.org/blogcacm/leveraging-graph-databases-for-fraud-detection-in-financial-systems/ ↩
-
https://www.yfiles.com/resources/how-to/visualizing-first-party-bank-fraud ↩