SYNDX — Web3 Infrastructure & Network Operations
Founder & Technical Operations Lead
At SYNDX, I led the operational management and reliability engineering of a multi-chain Web3 infrastructure environment supporting blockchain-based games, NFT integrations, AI systems, and tokenized assets. My role focused on ensuring production-grade performance, high availability, and secure blockchain network operations across distributed systems.
Blockchain Network Management
Oversaw deployment, optimization, and lifecycle management of blockchain nodes and RPC infrastructure across supported networks. Managed validator connections, full/archive node synchronization, and performance tuning to maintain stable throughput and low-latency JSON-RPC responses for production applications.
Actively monitored node health, peer connectivity, mempool behavior, and block propagation metrics to ensure consistent chain alignment and prevent drift or fork-related issues. Coordinated directly with ecosystem partners and infrastructure providers during upgrades, chain events, or network instability.
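As an illustration of the chain-alignment checks described above, here is a minimal sketch. It assumes EVM-compatible nodes that expose the standard `eth_blockNumber` JSON-RPC method. The helper names, node names, and the `max_lag` threshold are hypothetical and are not taken from the SYNDX stack.

```python
from __future__ import annotations

import json


def block_number_request(request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 payload for eth_blockNumber (assumes an EVM-compatible node)."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": request_id}
    )


def parse_block_number(response_body: str) -> int:
    """Extract the block height from a JSON-RPC response; the result is a hex string."""
    payload = json.loads(response_body)
    if "error" in payload:
        raise ValueError(f"RPC error: {payload['error']}")
    return int(payload["result"], 16)


def lagging_nodes(heights: dict[str, int], max_lag: int = 3) -> list[str]:
    """Return nodes whose height trails the highest observed height by more than max_lag blocks."""
    if not heights:
        return []
    tip = max(heights.values())
    return sorted(name for name, height in heights.items() if tip - height > max_lag)


# Hypothetical sample responses, as if collected from three nodes.
sample_responses = {
    "node-a": '{"jsonrpc": "2.0", "id": 1, "result": "0x64"}',
    "node-b": '{"jsonrpc": "2.0", "id": 1, "result": "0x63"}',
    "node-c": '{"jsonrpc": "2.0", "id": 1, "result": "0x5a"}',
}
heights = {name: parse_block_number(body) for name, body in sample_responses.items()}
print(lagging_nodes(heights))
```

Comparing each node against the highest observed height, rather than against a fixed value, keeps the check independent of any single reference node.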
Incident & Issue Resolution
Led incident response for complex Web3 production issues, including:
- RPC latency spikes
- Node desynchronization
- Smart contract execution failures
- Transaction broadcast inconsistencies
- Infrastructure scaling bottlenecks
Analyzed system logs, node telemetry, and JSON-RPC request/response payloads to isolate root causes. Implemented structured post-mortem documentation to reduce repeat incidents and improve long-term system resilience.
Worked cross-functionally with engineering, support, and external protocol teams to rapidly triage and resolve production-impacting events. Participated in structured on-call rotations to ensure 24/7 infrastructure reliability.
Monitoring & Alerting
Designed and maintained proactive monitoring systems to identify issues before they impacted users.
Implemented dashboards and alerting pipelines using tools such as:
- Grafana for metrics visualization
- Datadog for infrastructure and application observability
- Custom log aggregation and alert thresholds
Tracked key performance indicators including:
- RPC success rates
- Node CPU/memory utilization
- Chain sync status
- Error rate thresholds
- Request latency distributions
Built actionable alert policies to minimize noise while ensuring high-severity incidents were surfaced immediately.
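One common way to reduce alert noise is to fire only after several consecutive threshold breaches. The following is a minimal sketch of that idea, not the actual SYNDX alert policy. The class name, the 5% error-rate threshold, and the three-sample requirement are illustrative assumptions.

```python
from collections import deque


class ConsecutiveBreachAlert:
    """Fire only after `required` consecutive samples exceed the threshold."""

    def __init__(self, threshold: float, required: int = 3) -> None:
        self.threshold = threshold
        self.required = required
        # Keep only the most recent `required` breach flags.
        self._recent = deque(maxlen=required)

    def observe(self, value: float) -> bool:
        """Record one sample and return True if the alert should fire."""
        self._recent.append(value > self.threshold)
        return len(self._recent) == self.required and all(self._recent)


# Hypothetical error-rate samples: one isolated spike, then a sustained breach.
alert = ConsecutiveBreachAlert(threshold=0.05, required=3)
samples = [0.10, 0.02, 0.10, 0.10, 0.10]
fired = [alert.observe(sample) for sample in samples]
print(fired)
```

The isolated spike in the first sample does not page anyone; only the sustained breach at the end does.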
SLO/SLA Management
Defined service-level objectives (SLOs) and reliability benchmarks aligned with platform growth and user demand. Established measurable uptime targets, response time guarantees, and incident response timelines.
Enforced service-level agreement (SLA) compliance through continuous monitoring, automated reporting, and operational reviews. Reduced downtime risk by introducing structured reliability frameworks and documented escalation paths.
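To show how an uptime target translates into a measurable budget, here is a short worked sketch. The 99.9% target, the 30-day window, and the function names are illustrative assumptions, not the actual SYNDX commitments.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability target over a rolling window."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    return (1.0 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Minutes of error budget left; a negative value means the objective was missed."""
    return error_budget_minutes(slo_target, window_days) - downtime_minutes


# Hypothetical example: a 99.9% target over 30 days with 12 minutes of recorded downtime.
budget = error_budget_minutes(0.999)
remaining = budget_remaining(0.999, downtime_minutes=12.0)
print(f"budget={budget:.1f} min, remaining={remaining:.1f} min")
```

A 30-day window contains 43,200 minutes, so a 99.9% target allows roughly 43.2 minutes of downtime before the objective is breached.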
Automation & Optimization
Improved infrastructure consistency and scalability by implementing automation across environments.
Utilized infrastructure-as-code and orchestration tools such as:
- Terraform for reproducible infrastructure provisioning
- Kubernetes for container orchestration and scaling
- Ansible for configuration management and deployment consistency
Automated repetitive operational tasks including node deployments, health checks, and recovery workflows. Reduced manual intervention and improved deployment speed while maintaining security standards.
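An automated recovery workflow of the kind mentioned above can be sketched as a health check followed by restarts with exponential backoff. This is a minimal illustration under stated assumptions: the `is_healthy` and `restart` callables, the attempt limit, and the delays are hypothetical stand-ins for whatever the real tooling would provide.

```python
import time
from typing import Callable


def recover_node(
    is_healthy: Callable[[], bool],
    restart: Callable[[], None],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Restart a node until it reports healthy.

    Returns True once the node is healthy, or False when all attempts are
    exhausted, at which point the caller would escalate to a human.
    """
    if is_healthy():
        return True
    for attempt in range(max_attempts):
        restart()
        # Exponential backoff between attempts: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)
        if is_healthy():
            return True
    return False


# Hypothetical usage: the node reports unhealthy twice, then recovers.
states = iter([False, False, True])
restarts = []
delays = []
recovered = recover_node(
    is_healthy=lambda: next(states),
    restart=lambda: restarts.append("restart"),
    sleep=delays.append,  # record delays instead of actually sleeping
)
print(recovered, len(restarts), delays)
```

Injecting the `sleep` function keeps the workflow testable without real waiting, and the bounded attempt count ensures a persistent failure is escalated rather than retried indefinitely.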
Collaboration & Support
Worked closely with Tier-1 support to escalate and resolve production-impacting issues efficiently. Provided technical guidance to developers integrating blockchain services into applications.
Collaborated with product, engineering, and infrastructure teams to align operational reliability with roadmap milestones and scaling demands. Maintained clear technical documentation to ensure cross-team visibility and long-term knowledge retention.





