Hardware Systems Engineer

Remote
Mid-level
🇬🇧 United Kingdom
Systems Developer
Technology

About the department

Cloudflare’s Infrastructure group is responsible for building our global network. Our Hardware Engineering team helps research, develop, test, and deploy new equipment enabling 20% of the world’s Internet traffic to be served smoothly. Deployed across 285 cities in 100+ countries, the hardware we select helps improve the security, reliability, and performance of the Internet.

About the Role

We need to make thoughtful infrastructure choices affecting a significant portion of the Internet. Hardware we work with includes servers, routers, switches, optical equipment, power distribution units, cables, optics, and more. As a Hardware Systems Engineer, you will work with colleagues on the Hardware Engineering, Product, and Hardware Sourcing teams to troubleshoot and maintain Cloudflare’s worldwide fleet of storage and compute servers.

What you'll do

  • Develop and maintain automation tools to update firmware on servers and components in Cloudflare’s fleet
  • Work with software teams to validate bug fixes and performance of new firmware revisions
  • Test and deploy firmware updates to the fleet, monitoring the progress of the rollout for compliance and reliability
  • Work with server and component vendors to obtain, debug, and maintain the latest updates
  • Work with our Site Reliability Engineering teams to triage bug reports
  • Support our Data Centre Engineering teams in resolving hardware issues
  • Communicate your results and updates through blog posts, internal talks, and tickets

Examples of desirable skills, knowledge and experience

  • Bachelor’s degree in Computer Engineering, Electrical Engineering, or Computer Science
  • Desire to learn about the Cloudflare hardware used by almost 20% of all websites
  • Desire to learn how a diverse server fleet is managed at scale
  • Desire to learn the tools Cloudflare uses to maintain and monitor our hardware
  • Knowledge of PXE booting
  • Knowledge of configuration management; in particular, we use Salt to manage our fleet
  • Knowledge of Redfish, IPMI, and other server remote management protocols
  • Knowledge of running mission-critical production systems

Bonus Points

  • Familiarity with server hardware architecture
  • Knowledge of debugging server hardware faults and the ability to engage with our sourcing team and vendors to improve quality
  • Experience of managing large fleets comprising thousands of servers
  • Experience of observability and monitoring tools such as Prometheus and Grafana, and the ability to observe trends over time
  • Experience scripting and programming, in particular Python and Bash
  • Experience with software development tools and processes such as Git, Bitbucket, TeamCity, and Jira

 

Cloudflare

Cloudflare is a highly ambitious, large-scale technology company with a soul, committed to building a better Internet and protecting the free and open Internet.

Cloud Computing
Technology
Large Enterprise

🏭computer and network security
🎂2009
