- Irvine, CA
PriceSpider is a retail technology company filled with talented people relentlessly driven to revolutionize the online shopping experience. We are the fastest growing Brand Integrity, Where-to-Buy and data services innovator, providing unmatched insights into online consumer purchasing behaviors around the globe. Our technology helps manufacturers, marketers, and retailers radically improve their marketing impact, retail sales, and revenues. Our clients use PriceSpider’s proprietary “spidering” technology to crawl the web and power their tools to reveal the secrets of exactly what people buy—as well as where, when, and how. We continue to push the boundaries of our technology to create amazing user experiences for both our clients and their consumers. Today PriceSpider is helping nearly 500 brands around the globe.
PriceSpider NOC Technicians will work as a team and collaborate with IT, Engineering and Operations teams to optimize both infrastructure and application-layer performance of the “PriceSpider Platform”. The PriceSpider Platform is made up of a number of working components (e.g. queues, databases, file storage devices, API services, application processes, Docker and virtualized instances, and cloud vendors) that perform server-side web crawling, parsing, data storage, data transformation and reporting. This role works as part of a talented team of systems administrators and application technicians who demonstrate superb troubleshooting and real-time triage of complex systems to maximize workload throughput and avert application-tier outages. The role operates on mission-critical system resources and ensures the highest levels of availability and performance. We are looking for a NOC Technician to join the PriceSpider team in Irvine, with the potential for Swing and Graveyard shifts.
Essential Duties and Responsibilities:
- Mission-critical, first response to platform alerts and negative trends.
- Continuously improves monitoring, alerting and visibility into key components of the platform to enhance troubleshooting and optimization efforts.
- Responds to down-trends or threshold alerts on performance metrics, seeking to understand root causes and fastest/safest actions to remediate issues.
- Develops and maintains a playbook of issue-response procedures.
- Provides written incident reports for any impactful incidents, including root cause, timeframes, impacts, remediation actions taken and next steps.
- Provides recommendations to IT and/or Engineering teams for enhancements, bug fixes, or additional infrastructure in response to discovered performance issues.
- Coordinates and collaborates with IT and Software Engineering team members on tasks and projects to drive solutions.
- During “quiet periods” of system performance, works on a backlog of maturation projects (such as enhancing monitoring/alerting, creating Infrastructure-as-Code scripts to automate component deployments and scaling, and writing self-healing scripts for various known issues).
- Demonstrates a strong understanding of issues, including in-depth technical analysis, troubleshooting and resolving root causes, engaging with appropriate SMEs as needed to drive incident and problem management.
- Triage and Troubleshooting – Competently troubleshoot issues through Linux/Windows, networking, disk I/O, CPU, memory, queues (MSMQ, RabbitMQ), databases, and application microservices (including reading code to understand what might be causing an issue).
- Learning and Documenting – Able to come up to speed on the interrelated nature of application-layer components as well as the IT infrastructure eco-system that fuels the application functionality, and able to capture and document effective procedures to monitor, troubleshoot and remedy issues.
- Customer Service Attitude – Demonstrates a strong attitude of positive and solution-oriented action and communication that strives for an excellent inter-departmental experience of IT Services.
Required Experience and Education:
- Platforms: Linux, BSD, Windows
- Cloud: Google Cloud Platform, AWS, Azure
- Disk Storage Concepts: SAN (e.g. Compellent), NAS (e.g. FreeNAS), RAID, IOPS
- Networking: TCP/IP, Load Balancing, Reverse Proxy (nginx), IP addressing and routing
- Languages – Coding: Python, Bash, PowerShell, SQL
- Configuration Management or Infrastructure as Code Tools (1+): Ansible, Puppet, Juju, Chef
- Containerization/Virtualization (1+): Helm & Kubernetes, Docker or LXC, OpenStack
- Monitoring/Alerting Systems: LogicMonitor or Grafana
- Education: Demonstrated continuous education in IT and Software fields through active participation in user groups, conferences, open source contribution and/or eLearning.
Additional Beneficial Experience and Education:
- Languages – Comprehension of: C#, Node.js, Go
- Low Level File Systems: ext3/4, ZFS, NTFS, HFS+/APFS
- Disk Storage: HDFS, Ceph, GlusterFS
- DevOps: AWS CloudFormation, AWS Lambda, Terraform, Jenkins, Proxmox
- Data Platforms: Microsoft SQL Server (2012-2016), PostgreSQL (9.x-10.x), MongoDB, Elasticsearch, Memcached, Redis
- Education and Certifications: CCNA, RHCE, AWS Certifications, Google Cloud Certification
Monday, December 3, 2018