Parses PayPal SOAP logs

Add C++ implementation of transaction parser

- Implement transaction parser in C++ for better performance
- Add CMake build system for cross-platform compatibility
- Create nix flake for reproducible builds
- Optimize XML parsing with regex for faster processing

💙 Generated with Crush
Co-Authored-By: 💙 Crush <crush@charm.land>

dunkirk.sh a4e67237

verified
+23
CMakeLists.txt
···
+
cmake_minimum_required(VERSION 3.10)
+
project(soapdump VERSION 0.1.0 LANGUAGES CXX)
+
+
# Set C++ standard
+
set(CMAKE_CXX_STANDARD 17)
+
set(CMAKE_CXX_STANDARD_REQUIRED ON)
+
set(CMAKE_CXX_EXTENSIONS OFF)
+
+
# Add executable
+
add_executable(transaction-parser src/transaction-parser.cpp)
+
+
# Set compiler flags
+
target_compile_options(transaction-parser PRIVATE
+
-Wall
+
-Wextra
+
-Wpedantic
+
-O3
+
)
+
+
# Install target
+
install(TARGETS transaction-parser
+
RUNTIME DESTINATION bin
+
)
+36
CRUSH.md
···
+
# CRUSH Development Guidelines
+
+
## Commands
+
- **Run parser**: `./transaction-parser.sh <logfile>`
+
- **Test parser**: `./transaction-parser.sh -s <logfile>` (summary mode)
+
- **Help**: `./transaction-parser.sh --help`
+
+
## Code Style
+
- **Language**: Bash scripting
+
- **Shebang**: Use `#!/usr/bin/env nix-shell` with dependencies
+
- **Formatting**:
+
- 4-space indentation
+
- Functions use snake_case
+
- Variables use UPPER_CASE
+
- **Structure**:
+
- Clear function separation
+
- Help documentation included
+
- Error handling with exit codes
+
- **Dependencies**: gnugrep, gnused, coreutils (via nix-shell)
+
+
## Best Practices
+
- Always validate input files exist
+
- Use proper error messages and exit codes
+
- Include comprehensive help documentation
+
- Follow pipe-separated (`|`) output format for structured data
+
- Handle edge cases in XML parsing
+
+
## Naming Conventions
+
- Variables: UPPER_CASE
+
- Functions: snake_case
+
- Files: kebab-case.sh
+
+
## Error Handling
+
- Check file existence before processing
+
- Validate arguments
+
- Exit with appropriate codes (0 for success, 1 for error)
+27
LICENSE.md
···
+
The MIT License (MIT)
+
=====================
+
+
Copyright © `2025` `Kieran Klukas`
+
+
Permission is hereby granted, free of charge, to any person
+
obtaining a copy of this software and associated documentation
+
files (the “Software”), to deal in the Software without
+
restriction, including without limitation the rights to use,
+
copy, modify, merge, publish, distribute, sublicense, and/or sell
+
copies of the Software, and to permit persons to whom the
+
Software is furnished to do so, subject to the following
+
conditions:
+
+
The above copyright notice and this permission notice shall be
+
included in all copies or substantial portions of the Software.
+
+
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+
OTHER DEALINGS IN THE SOFTWARE.
+
+
+77
README.md
···
+
# SoapDump
+
+
A high-performance PayPal SOAP log parser written in C++.
+
+
![a media offline screen from davinci resolve](https://hc-cdn.hel1.your-objectstorage.com/s/v3/8744661d883d695adc6b14a17a52ac8970d7c2dd_image.png)
+
+
## Features
+
+
- Fast parsing of PayPal SOAP transaction logs
+
- Structured output for easy analysis
+
- Summary statistics mode
+
- Pipe-friendly output format
+
- Nix flake for reproducible builds
+
+
## Usage
+
+
```bash
+
# Get all transactions
+
./transaction-parser payments.log
+
+
# Get only successful transactions
+
./transaction-parser payments.log | grep Success
+
+
# Count transactions by state
+
./transaction-parser payments.log | cut -d'|' -f8 | sort | uniq -c | sort -nr
+
+
# Find largest transaction
+
./transaction-parser payments.log | sort -t'|' -k2 -nr | head -1
+
+
# Get transactions over $500
+
./transaction-parser payments.log | awk -F'|' '$2 > 500'
+
+
# Summary stats
+
./transaction-parser -s payments.log
+
```
+
+
## Building
+
+
### Using Nix
+
+
```bash
+
# Build the project
+
nix build
+
+
# Run the parser
+
nix run . -- payments.log
+
+
# Development shell
+
nix develop
+
```
+
+
### Using CMake
+
+
```bash
+
# Build the project
+
cmake -B build -S .
+
cmake --build build --config Release
+
+
# Run the parser
+
./build/transaction-parser payments.log
+
```
+
+
## Output Format
+
+
Pipe-separated values (`|` delimiter, one transaction per line) with the following fields:
+
+
```
+
TRANS_NUM|AMOUNT|CURRENCY|FIRSTNAME|LASTNAME|STREET|CITY|STATE|ZIP|CCTYPE|CCLAST4|EXPMONTH|EXPYEAR|CVV|TRANSID|STATUS|CORRID|PROC_AMOUNT
+
```
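
Each record is a single line whose fields are joined by `|`, in the order listed above. As a rough illustration of consuming that output from another program, here is a minimal sketch in C++. The helper name `splitRecord` is hypothetical and not part of soapdump; it only assumes the delimiter shown in the format line.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of soapdump): split one output record on '|'.
// Empty fields are kept so that column positions stay stable.
std::vector<std::string> splitRecord(const std::string& line, char delim = '|') {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        const auto pos = line.find(delim, start);
        if (pos == std::string::npos) {
            // Last field: everything after the final delimiter.
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}
```

With this layout, index 1 of the result is `AMOUNT`, index 7 is `STATE`, and index 15 is `STATUS`. The parser writes field values verbatim, so a value that itself contains `|` would shift the columns; this sketch does not try to handle that case.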
+
+
<p align="center">
+
<i><code>&copy; 2025-present <a href="https://github.com/taciturnaxolotl">Kieran Klukas</a></code></i>
+
</p>
+
+
<p align="center">
+
<a href="https://github.com/taciturnaxolotl/soapdump/blob/main/LICENSE.md"><img src="https://img.shields.io/static/v1.svg?style=for-the-badge&label=License&message=MIT&logoColor=d9e0ee&colorA=363a4f&colorB=b7bdf8"/></a>
+
</p>
+61
flake.lock
···
+
{
+
"nodes": {
+
"nixpkgs": {
+
"locked": {
+
"lastModified": 1757487488,
+
"narHash": "sha256-zwE/e7CuPJUWKdvvTCB7iunV4E/+G0lKfv4kk/5Izdg=",
+
"owner": "NixOS",
+
"repo": "nixpkgs",
+
"rev": "ab0f3607a6c7486ea22229b92ed2d355f1482ee0",
+
"type": "github"
+
},
+
"original": {
+
"owner": "NixOS",
+
"ref": "nixos-unstable",
+
"repo": "nixpkgs",
+
"type": "github"
+
}
+
},
+
"root": {
+
"inputs": {
+
"nixpkgs": "nixpkgs",
+
"utils": "utils"
+
}
+
},
+
"systems": {
+
"locked": {
+
"lastModified": 1681028828,
+
"narHash": "sha256-Vy1rq5AaRuLzOxct8nz4T6wlgyUR7zLU309k9mBC768=",
+
"owner": "nix-systems",
+
"repo": "default",
+
"rev": "da67096a3b9bf56a91d16901293e51ba5b49a27e",
+
"type": "github"
+
},
+
"original": {
+
"owner": "nix-systems",
+
"repo": "default",
+
"type": "github"
+
}
+
},
+
"utils": {
+
"inputs": {
+
"systems": "systems"
+
},
+
"locked": {
+
"lastModified": 1731533236,
+
"narHash": "sha256-l0KFg5HjrsfsO/JpG+r7fRrqm12kzFHyUHqHCVpMMbI=",
+
"owner": "numtide",
+
"repo": "flake-utils",
+
"rev": "11707dc2f618dd54ca8739b309ec4fc024de578b",
+
"type": "github"
+
},
+
"original": {
+
"owner": "numtide",
+
"repo": "flake-utils",
+
"type": "github"
+
}
+
}
+
},
+
"root": "root",
+
"version": 7
+
}
+68
flake.nix
···
+
{
+
description = "PayPal SOAP Log Parser";
+
+
inputs = {
+
nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
+
utils.url = "github:numtide/flake-utils";
+
};
+
+
outputs = { self, nixpkgs, utils }:
+
utils.lib.eachDefaultSystem (system:
+
let
+
pkgs = nixpkgs.legacyPackages.${system};
+
+
# Build dependencies
+
buildInputs = with pkgs; [
+
cmake
+
];
+
+
# Development dependencies
+
nativeBuildInputs = with pkgs; [
+
clang-tools
+
bear
+
];
+
in
+
{
+
packages = rec {
+
soapdump = pkgs.stdenv.mkDerivation {
+
pname = "soapdump";
+
version = "0.1.0";
+
+
src = ./.;
+
+
nativeBuildInputs = [ pkgs.clang ];
+
+
buildPhase = ''
+
mkdir -p bin
+
clang++ -std=c++17 -O3 -o bin/transaction-parser src/transaction-parser.cpp
+
'';
+
+
installPhase = ''
+
mkdir -p $out/bin
+
cp bin/transaction-parser $out/bin/
+
'';
+
};
+
+
default = soapdump;
+
};
+
+
devShells.default = pkgs.mkShell {
+
inherit buildInputs nativeBuildInputs;
+
+
shellHook = ''
+
echo "SoapDump development environment loaded"
+
'';
+
};
+
+
apps = rec {
+
transaction-parser = {
+
type = "app";
+
program = "${self.packages.${system}.default}/bin/transaction-parser";
+
drv = self.packages.${system}.default;
+
};
+
+
default = transaction-parser;
+
};
+
}
+
);
+
}
+353
src/transaction-parser.cpp
···
+
#include <cstdio>
+
#include <iostream>
+
#include <fstream>
+
#include <string>
+
#include <vector>
+
#include <map>
+
#include <regex>
+
#include <algorithm>
+
#include <numeric>
+
#include <iomanip>
+
#include <iterator>
+
#include <getopt.h>
+
+
// Transaction data structure
+
struct Transaction {
+
int transNum;
+
std::string amount;
+
std::string currency;
+
std::string firstName;
+
std::string lastName;
+
std::string street;
+
std::string city;
+
std::string state;
+
std::string zip;
+
std::string ccType;
+
std::string ccLast4;
+
std::string expMonth;
+
std::string expYear;
+
std::string cvv;
+
std::string transId;
+
std::string status;
+
std::string corrId;
+
std::string procAmount;
+
};
+
+
// Response data structure
+
struct Response {
+
std::string transId;
+
std::string status;
+
std::string corrId;
+
std::string procAmount;
+
};
+
+
// Function prototypes
+
void showHelp(const char* programName);
+
std::string extractXmlValue(const std::string& xml, const std::string& tag);
+
std::string extractXmlAttribute(const std::string& xml, const std::string& attribute);
+
std::vector<std::string> extractRequests(const std::string& logContent);
+
std::vector<std::string> extractResponses(const std::string& logContent);
+
std::vector<Response> parseResponses(const std::vector<std::string>& responseXmls);
+
std::vector<Transaction> parseTransactions(const std::vector<std::string>& requestXmls, const std::vector<Response>& responses);
+
void outputRawData(const std::vector<Transaction>& transactions);
+
void outputSummary(const std::vector<Transaction>& transactions);
+
+
int main(int argc, char* argv[]) {
+
// Default options
+
bool summaryOnly = false;
+
std::string logFile;
+
+
// Parse command line options
+
static struct option longOptions[] = {
+
{"help", no_argument, 0, 'h'},
+
{"summary", no_argument, 0, 's'},
+
{"raw", no_argument, 0, 'r'},
+
{0, 0, 0, 0}
+
};
+
+
int optionIndex = 0;
+
int opt;
+
while ((opt = getopt_long(argc, argv, "hsr", longOptions, &optionIndex)) != -1) {
+
switch (opt) {
+
case 'h':
+
showHelp(argv[0]);
+
return 0;
+
case 's':
+
summaryOnly = true;
+
break;
+
case 'r':
+
summaryOnly = false;
+
break;
+
case '?':
+
std::cerr << "Unknown option: " << static_cast<char>(optopt) << std::endl;
+
showHelp(argv[0]);
+
return 1;
+
default:
+
break;
+
}
+
}
+
+
// Get logfile name
+
if (optind < argc) {
+
logFile = argv[optind];
+
} else {
+
std::cerr << "Error: No logfile specified" << std::endl;
+
showHelp(argv[0]);
+
return 1;
+
}
+
+
// Check if file exists
+
std::ifstream file(logFile);
+
if (!file.is_open()) {
+
std::cerr << "Error: File '" << logFile << "' not found" << std::endl;
+
return 1;
+
}
+
+
// Read the entire file
+
std::string logContent((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
+
file.close();
+
+
// Extract requests and responses
+
std::vector<std::string> requestXmls = extractRequests(logContent);
+
std::vector<std::string> responseXmls = extractResponses(logContent);
+
+
// Parse responses
+
std::vector<Response> responses = parseResponses(responseXmls);
+
+
// Parse transactions
+
std::vector<Transaction> transactions = parseTransactions(requestXmls, responses);
+
+
// Output data
+
if (summaryOnly) {
+
outputSummary(transactions);
+
} else {
+
outputRawData(transactions);
+
}
+
+
return 0;
+
}
+
+
void showHelp(const char* programName) {
+
std::cout << "PayPal SOAP Log Parser\n\n";
+
std::cout << "USAGE:\n";
+
std::cout << " " << programName << " [OPTIONS] <logfile>\n\n";
+
std::cout << "OPTIONS:\n";
+
std::cout << " -h, --help Show this help message\n";
+
std::cout << " -s, --summary Show summary statistics only\n";
+
std::cout << " -r, --raw Output raw structured data (default)\n\n";
+
std::cout << "OUTPUT FORMAT (pipe-separated):\n";
+
std::cout << " TRANS_NUM|AMOUNT|CURRENCY|FIRSTNAME|LASTNAME|STREET|CITY|STATE|ZIP|CCTYPE|CCLAST4|EXPMONTH|EXPYEAR|CVV|TRANSID|STATUS|CORRID|PROC_AMOUNT\n\n";
+
std::cout << "FIELD DESCRIPTIONS:\n";
+
std::cout << " TRANS_NUM - Transaction sequence number\n";
+
std::cout << " AMOUNT - Order total amount\n";
+
std::cout << " CURRENCY - Currency code (USD, etc)\n";
+
std::cout << " FIRSTNAME - Customer first name\n";
+
std::cout << " LASTNAME - Customer last name\n";
+
std::cout << " STREET - Street address\n";
+
std::cout << " CITY - City name\n";
+
std::cout << " STATE - State/Province code\n";
+
std::cout << " ZIP - Postal code\n";
+
std::cout << " CCTYPE - Credit card type (Visa, MasterCard, etc)\n";
+
std::cout << " CCLAST4 - Last 4 digits of credit card\n";
+
std::cout << " EXPMONTH - Card expiration month\n";
+
std::cout << " EXPYEAR - Card expiration year\n";
+
std::cout << " CVV - CVV code\n";
+
std::cout << " TRANSID - PayPal transaction ID\n";
+
std::cout << " STATUS - Transaction status (Success/Failure)\n";
+
std::cout << " CORRID - Correlation ID\n";
+
std::cout << " PROC_AMOUNT - Actually processed amount\n\n";
+
std::cout << "EXAMPLES:\n";
+
std::cout << " # Get all transactions\n";
+
std::cout << " " << programName << " payments.log\n\n";
+
std::cout << " # Get only successful transactions\n";
+
std::cout << " " << programName << " payments.log | grep Success\n\n";
+
std::cout << " # Count transactions by state\n";
+
std::cout << " " << programName << " payments.log | cut -d'|' -f8 | sort | uniq -c | sort -nr\n\n";
+
std::cout << " # Find largest transaction\n";
+
std::cout << " " << programName << " payments.log | sort -t'|' -k2 -nr | head -1\n\n";
+
std::cout << " # Get transactions over $500\n";
+
std::cout << " " << programName << " payments.log | awk -F'|' '$2 > 500'\n\n";
+
std::cout << " # Summary stats\n";
+
std::cout << " " << programName << " -s payments.log\n";
+
}
+
+
std::string extractXmlValue(const std::string& xml, const std::string& tag) {
+
std::regex pattern("<" + tag + "(?:[^>]*)>([^<]*)</" + tag + ">");
+
std::smatch match;
+
if (std::regex_search(xml, match, pattern) && match.size() > 1) {
+
return match[1].str();
+
}
+
return "";
+
}
+
+
std::string extractXmlAttribute(const std::string& xml, const std::string& attribute) {
+
std::regex pattern(attribute + "=\"([^\"]*)\"");
+
std::smatch match;
+
if (std::regex_search(xml, match, pattern) && match.size() > 1) {
+
return match[1].str();
+
}
+
return "";
+
}
+
+
std::vector<std::string> extractRequests(const std::string& logContent) {
+
std::vector<std::string> requests;
+
std::regex pattern("PPAPIService: Request: (.*)");
+
+
std::string::const_iterator searchStart(logContent.cbegin());
+
std::smatch match;
+
while (std::regex_search(searchStart, logContent.cend(), match, pattern)) {
+
if (match.size() > 1) {
+
requests.push_back(match[1].str());
+
}
+
searchStart = match.suffix().first;
+
}
+
+
return requests;
+
}
+
+
std::vector<std::string> extractResponses(const std::string& logContent) {
+
std::vector<std::string> responses;
+
std::regex pattern("PPAPIService: Response: <\\?.*\\?>(.*)");
+
+
std::string::const_iterator searchStart(logContent.cbegin());
+
std::smatch match;
+
while (std::regex_search(searchStart, logContent.cend(), match, pattern)) {
+
if (match.size() > 1) {
+
responses.push_back(match[1].str());
+
}
+
searchStart = match.suffix().first;
+
}
+
+
return responses;
+
}
+
+
std::vector<Response> parseResponses(const std::vector<std::string>& responseXmls) {
+
std::vector<Response> responses;
+
+
for (const auto& xml : responseXmls) {
+
Response response;
+
response.transId = extractXmlValue(xml, "TransactionID");
+
response.status = extractXmlValue(xml, "Ack");
+
response.corrId = extractXmlValue(xml, "CorrelationID");
+
response.procAmount = extractXmlValue(xml, "Amount");
+
+
responses.push_back(response);
+
}
+
+
return responses;
+
}
+
+
std::vector<Transaction> parseTransactions(const std::vector<std::string>& requestXmls, const std::vector<Response>& responses) {
+
std::vector<Transaction> transactions;
+
int transNum = 1;
+
+
for (size_t i = 0; i < requestXmls.size(); ++i) {
+
const auto& xml = requestXmls[i];
+
+
Transaction transaction;
+
transaction.transNum = transNum++;
+
+
// Extract request fields
+
transaction.amount = extractXmlValue(xml, "ebl:OrderTotal");
+
transaction.currency = extractXmlAttribute(xml, "currencyID");
+
transaction.firstName = extractXmlValue(xml, "ebl:FirstName");
+
transaction.lastName = extractXmlValue(xml, "ebl:LastName");
+
transaction.street = extractXmlValue(xml, "ebl:Street1");
+
transaction.city = extractXmlValue(xml, "ebl:CityName");
+
transaction.state = extractXmlValue(xml, "ebl:StateOrProvince");
+
transaction.zip = extractXmlValue(xml, "ebl:PostalCode");
+
transaction.ccType = extractXmlValue(xml, "ebl:CreditCardType");
+
transaction.ccLast4 = extractXmlValue(xml, "ebl:CreditCardLastFourDigits");
+
transaction.expMonth = extractXmlValue(xml, "ebl:ExpMonth");
+
transaction.expYear = extractXmlValue(xml, "ebl:ExpYear");
+
transaction.cvv = extractXmlValue(xml, "ebl:CVV2");
+
+
// Get corresponding response data
+
if (i < responses.size()) {
+
transaction.transId = responses[i].transId;
+
transaction.status = responses[i].status;
+
transaction.corrId = responses[i].corrId;
+
transaction.procAmount = responses[i].procAmount;
+
}
+
+
transactions.push_back(transaction);
+
}
+
+
return transactions;
+
}
+
+
void outputRawData(const std::vector<Transaction>& transactions) {
+
for (const auto& t : transactions) {
+
std::cout << t.transNum << "|"
+
<< t.amount << "|"
+
<< t.currency << "|"
+
<< t.firstName << "|"
+
<< t.lastName << "|"
+
<< t.street << "|"
+
<< t.city << "|"
+
<< t.state << "|"
+
<< t.zip << "|"
+
<< t.ccType << "|"
+
<< t.ccLast4 << "|"
+
<< t.expMonth << "|"
+
<< t.expYear << "|"
+
<< t.cvv << "|"
+
<< t.transId << "|"
+
<< t.status << "|"
+
<< t.corrId << "|"
+
<< t.procAmount << std::endl;
+
}
+
}
+
+
void outputSummary(const std::vector<Transaction>& transactions) {
+
std::cout << "=== SUMMARY ===" << std::endl;
+
+
// Count transactions
+
const int total = static_cast<int>(transactions.size());
+
const int successful = static_cast<int>(std::count_if(transactions.begin(), transactions.end(),
+
[](const Transaction& t) { return t.status == "Success"; }));
+
+
std::cout << "Total Transactions: " << total << std::endl;
+
std::cout << "Successful: " << successful << std::endl;
+
std::cout << "Failed: " << (total - successful) << std::endl;
+
std::cout << std::endl;
+
+
// Top 5 states
+
std::map<std::string, int> stateCounts;
+
for (const auto& t : transactions) {
+
stateCounts[t.state]++;
+
}
+
+
std::cout << "Top 5 States by Transaction Count:" << std::endl;
+
std::vector<std::pair<std::string, int>> stateCountVec(stateCounts.begin(), stateCounts.end());
+
std::sort(stateCountVec.begin(), stateCountVec.end(),
+
[](const auto& a, const auto& b) { return a.second > b.second; });
+
+
int count = 0;
+
for (const auto& sc : stateCountVec) {
+
if (count++ >= 5) break;
+
std::cout << " " << sc.first << ": " << sc.second << std::endl;
+
}
+
std::cout << std::endl;
+
+
// Transaction amount stats
+
std::vector<double> amounts;
+
for (const auto& t : transactions) {
+
try {
+
amounts.push_back(std::stod(t.amount));
+
} catch (...) {
+
// Skip invalid amounts
+
}
+
}
+
+
if (!amounts.empty()) {
+
double totalAmount = std::accumulate(amounts.begin(), amounts.end(), 0.0);
+
double largest = *std::max_element(amounts.begin(), amounts.end());
+
double smallest = *std::min_element(amounts.begin(), amounts.end());
+
+
std::cout << "Transaction Amount Stats:" << std::endl;
+
std::cout << " Total: $" << std::fixed << std::setprecision(2) << totalAmount << std::endl;
+
std::cout << " Largest: $" << std::fixed << std::setprecision(2) << largest << std::endl;
+
std::cout << " Smallest: $" << std::fixed << std::setprecision(2) << smallest << std::endl;
+
}
+
}
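
`extractXmlValue` above builds a new `std::regex` on every call, and `parseTransactions` calls it for more than a dozen tags per request. One possible variation, shown here only as a sketch and not as the commit's implementation, is to cache the compiled pattern per tag. The name `extractXmlValueCached` and the cache itself are assumptions of this example; the pattern is the same one used in the diff.

```cpp
#include <regex>
#include <string>
#include <unordered_map>

// Hypothetical variant of extractXmlValue: same pattern as in the diff, but the
// compiled std::regex for each tag is cached and reused across calls.
// The static cache is not thread-safe, which is acceptable for a
// single-threaded command-line tool.
std::string extractXmlValueCached(const std::string& xml, const std::string& tag) {
    static std::unordered_map<std::string, std::regex> cache;

    auto it = cache.find(tag);
    if (it == cache.end()) {
        // Matches <tag ...>text</tag> and captures the text between the tags.
        it = cache.emplace(tag, std::regex("<" + tag + "(?:[^>]*)>([^<]*)</" + tag + ">")).first;
    }

    std::smatch match;
    if (std::regex_search(xml, match, it->second)) {
        return match[1].str();
    }
    return "";  // Tag missing, or its content contains nested elements.
}
```

Like the original, this returns an empty string when the tag is absent and only captures plain text content, so an element with nested children yields no match. Whether caching gives a measurable speedup on real logs has not been measured here.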