- Email: [email protected]
- LinkedIn: aleksandr-kedrik
- GitHub: alkedr
- TL;DR: An experienced software engineer looking to transition from working on LLM training infrastructure to AGI safety research.
- Apr 2024 – present
- Sped up model checkpoint uploads during training by 2x, allowing more frequent checkpoints and less lost progress after failures.
- Wrote a tool for benchmarking different methods of checkpoint uploading.
- Sept 2023 – Apr 2024
- Designed and implemented infrastructure for a real-time video processing pipeline for tracking player movements on a football field. Python, PostgreSQL, a message queue similar to Kafka, FFmpeg.
- Wrote a tool for visualizing the progress of each video chunk in real time.
- Nov 2022 – Sept 2023
- Sped up a video 3D reconstruction pipeline 4x by finding and fixing infrastructure issues and by profiling with the PyTorch profiler.
- Researched and tested alternatives to the differentiable 3D renderer we were using: Nvdiffrast, Mitsuba 3, PyTorch3D.
- Wrote a tool for converting a reconstructed animated 3D model back into video for easy viewing.
- May 2020 – Nov 2022
- Designed and implemented a task queue service for launching simulator tests that could scale to thousands of workers. Python, PostgreSQL.
- Designed, prototyped, and helped implement an inference server cluster that could handle >12k RPS and tens of Gbit/s of incoming traffic. Python, C++, NVIDIA Triton Inference Server.
- June 2017 – May 2020
- Worked on a CI/CD web service and a log parsing service. Java, MongoDB.
- Aug 2013 – June 2017
Saint-Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO), BS in Computer Science
- Sept 2011 – June 2015
- Technologies: Python, C++, PyTorch, Jupyter, Cursor