Update Photo and CV
Orion-Zheng committed Jan 5, 2024
1 parent 4440fe1 commit e351168
Showing 8 changed files with 71 additions and 89 deletions.
4 changes: 2 additions & 2 deletions _config.yml
@@ -7,7 +7,7 @@

# Site Settings
locale : "en-US"
title : "Zian(Andy) Zheng's Homepage"
title : "Zian(Andy) Zheng"
title_separator : "-"
name : &name "Zian(Andy) Zheng"
description : &description "personal description"
@@ -81,7 +81,7 @@ analytics:
# Site Author
author:
name : "Zian(Andy) Zheng"
avatar : "avatar.png"
avatar : "avatar.jpg"
bio : "Master Student at HPC-AI Lab, NUS"
location : "Singapore"
employer :
24 changes: 12 additions & 12 deletions _data/navigation.yml
@@ -1,22 +1,22 @@
# main links links
main:
- title: "Publications"
url: /publications/
# - title: "Publications"
# url: /publications/

- title: "Talks"
url: /talks/
# - title: "Talks"
# url: /talks/

- title: "Teaching"
url: /teaching/
# - title: "Teaching"
# url: /teaching/

- title: "Portfolio"
url: /portfolio/
# - title: "Portfolio"
# url: /portfolio/

- title: "Blog Posts"
url: /year-archive/
# - title: "Blog Posts"
# url: /year-archive/

- title: "CV"
url: /cv/

- title: "Guide"
url: /markdown/
# - title: "Guide"
# url: /markdown/
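
Because this view interleaves removed and added lines without `+`/`-` markers, the resulting file can be hard to read off. A sketch of how `_data/navigation.yml` would look after this commit, assuming the usual two-space YAML indentation (the indentation is not visible in the flattened diff and is an assumption here):

```yaml
# main links links
main:
  # - title: "Publications"
  #   url: /publications/

  # - title: "Talks"
  #   url: /talks/

  # - title: "Teaching"
  #   url: /teaching/

  # - title: "Portfolio"
  #   url: /portfolio/

  # - title: "Blog Posts"
  #   url: /year-archive/

  - title: "CV"
    url: /cv/

  # - title: "Guide"
  #   url: /markdown/
```

Only the "CV" entry stays active; the other entries are commented out rather than deleted, so each can be restored by uncommenting its two lines.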
47 changes: 12 additions & 35 deletions _pages/about.md
@@ -7,44 +7,21 @@ redirect_from:
- /about/
- /about.html
---
Hello there! I’m Zian(Andy) Zheng, an AI enthusiast on a quest for challenges and unexplored horizons. As a Master’s student, I’m always eager to push boundaries and embrace novel ideas. Beyond the tech realm, I thrive on adrenaline-fueled outdoor pursuits like skydiving, scuba diving, free diving, climbing, and kayaking. Whether it’s conquering algorithms or mountains, I relish every opportunity to grow, learn, and adapt.

This is the front page of a website that is powered by the [academicpages template](https://github.com/academicpages/academicpages.github.io) and hosted on GitHub pages. [GitHub pages](https://pages.github.com) is a free service in which websites are built and hosted from code and data stored in a GitHub repository, automatically updating when a new commit is made to the repository. This template was forked from the [Minimal Mistakes Jekyll Theme](https://mmistakes.github.io/minimal-mistakes/) created by Michael Rose, and then extended to support the kinds of content that academics have: publications, talks, teaching, a portfolio, blog posts, and a dynamically-generated CV. You can fork [this repository](https://github.com/academicpages/academicpages.github.io) right now, modify the configuration and markdown files, add your own PDFs and other content, and have your own site for free, with no ads! An older version of this template powers my own personal website at [stuartgeiger.com](http://stuartgeiger.com), which uses [this GitHub repository](https://github.com/staeiou/staeiou.github.io).

A data-driven personal website
======
Like many other Jekyll-based GitHub Pages templates, academicpages makes you separate the website's content from its form. The content & metadata of your website are in structured markdown files, while various other files constitute the theme, specifying how to transform that content & metadata into HTML pages. You keep these various markdown (.md), YAML (.yml), HTML, and CSS files in a public GitHub repository. Each time you commit and push an update to the repository, the [GitHub pages](https://pages.github.com/) service creates static HTML pages based on these files, which are hosted on GitHub's servers free of charge.

Many of the features of dynamic content management systems (like Wordpress) can be achieved in this fashion, using a fraction of the computational resources and with far less vulnerability to hacking and DDoSing. You can also modify the theme to your heart's content without touching the content of your site. If you get to a point where you've broken something in Jekyll/HTML/CSS beyond repair, your markdown files describing your talks, publications, etc. are safe. You can rollback the changes or even delete the repository and start over -- just be sure to save the markdown files! Finally, you can also write scripts that process the structured data on the site, such as [this one](https://github.com/academicpages/academicpages.github.io/blob/master/talkmap.ipynb) that analyzes metadata in pages about talks to display [a map of every location you've given a talk](https://academicpages.github.io/talkmap.html).

Getting started
======
1. Register a GitHub account if you don't have one and confirm your e-mail (required!)
1. Fork [this repository](https://github.com/academicpages/academicpages.github.io) by clicking the "fork" button in the top right.
1. Go to the repository's settings (rightmost item in the tabs that start with "Code", should be below "Unwatch"). Rename the repository "[your GitHub username].github.io", which will also be your website's URL.
1. Set site-wide configuration and create content & metadata (see below -- also see [this set of diffs](http://archive.is/3TPas) showing what files were changed to set up [an example site](https://getorg-testacct.github.io) for a user with the username "getorg-testacct")
1. Upload any files (like PDFs, .zip files, etc.) to the files/ directory. They will appear at https://[your GitHub username].github.io/files/example.pdf.
1. Check status by going to the repository settings, in the "GitHub pages" section

Site-wide configuration
Research:
------
The main configuration file for the site is in the base directory in [_config.yml](https://github.com/academicpages/academicpages.github.io/blob/master/_config.yml), which defines the content in the sidebars and other site-wide features. You will need to replace the default variables with ones about yourself and your site's github repository. The configuration file for the top menu is in [_data/navigation.yml](https://github.com/academicpages/academicpages.github.io/blob/master/_data/navigation.yml). For example, if you don't have a portfolio or blog posts, you can remove those items from that navigation.yml file to remove them from the header.
I have a deep interest in Large Language Models (LLMs), especially in the following areas:
- Data-Centric Approaches: Focusing on data quality and data strategies (e.g. Data Mixture and Data Curriculum)
- Efficient LLM Design and Training: MoE (Mixture-of-Experts) Model, Efficient Context Extrapolation Method.
- Maximizing Trained LLM Availability/Capability: Efficient Inference, Prompt Engineering, LLM-based Agent.

Create content & metadata
------
For site content, there is one markdown file for each type of content, which are stored in directories like _publications, _talks, _posts, _teaching, or _pages. For example, each talk is a markdown file in the [_talks directory](https://github.com/academicpages/academicpages.github.io/tree/master/_talks). At the top of each markdown file is structured data in YAML about the talk, which the theme will parse to do lots of cool stuff. The same structured data about a talk is used to generate the list of talks on the [Talks page](https://academicpages.github.io/talks), each [individual page](https://academicpages.github.io/talks/2012-03-01-talk-1) for specific talks, the talks section for the [CV page](https://academicpages.github.io/cv), and the [map of places you've given a talk](https://academicpages.github.io/talkmap.html) (if you run this [python file](https://github.com/academicpages/academicpages.github.io/blob/master/talkmap.py) or [Jupyter notebook](https://github.com/academicpages/academicpages.github.io/blob/master/talkmap.ipynb), which creates the HTML for the map based on the contents of the _talks directory).
![Overview of My Research Interest](/images/research_interests.png)

**Markdown generator**

I have also created [a set of Jupyter notebooks](https://github.com/academicpages/academicpages.github.io/tree/master/markdown_generator) that converts a CSV containing structured data about talks or presentations into individual markdown files that will be properly formatted for the academicpages template. The sample CSVs in that directory are the ones I used to create my own personal website at stuartgeiger.com. My usual workflow is that I keep a spreadsheet of my publications and talks, then run the code in these notebooks to generate the markdown files, then commit and push them to the GitHub repository.

How to edit your site's GitHub repository
Personal Information
------
Many people use a git client to create files on their local computer and then push them to GitHub's servers. If you are not familiar with git, you can edit these configuration and markdown files directly in the github.com interface. Navigate to a file (like [this one](https://github.com/academicpages/academicpages.github.io/blob/master/_talks/2012-03-01-talk-1.md)) and click the pencil icon in the top right of the content preview (to the right of the "Raw | Blame | History" buttons). You can delete a file by clicking the trashcan icon to the right of the pencil icon. You can also create new files or upload files by navigating to a directory and clicking the "Create new file" or "Upload files" buttons.
Here are some interesting facts about me:
- I have the chance to be in the 18th generation of Newton's students. The academic family tree is [here](/files/academic_family_tree.md).
- I am an extreme sports enthusiast. You can call me 'Tri-diver' (skydiver, freediver, scuba diver), as this represents my achievements in scuba diving, free diving, and skydiving.
- I write poems about life and love (in English/Chinese). For example, [here](/images/poem.png) is one of my poems about 'Tri-diver'.

Example: editing a markdown file for a talk
![Editing a markdown file for a talk](/images/editing-talk.png)

For more info
------
More info about configuring academicpages can be found in [the guide](https://academicpages.github.io/markdown/). The [guides for the Minimal Mistakes theme](https://mmistakes.github.io/minimal-mistakes/docs/configuration/) (which this theme was forked from) might also be helpful.
67 changes: 27 additions & 40 deletions _pages/cv.md
@@ -11,49 +11,36 @@ redirect_from:

Education
======
* B.S. in GitHub, GitHub University, 2012
* M.S. in Jekyll, GitHub University, 2014
* Ph.D in Version Control Theory, GitHub University, 2018 (expected)
* **B.Eng. in Data Science, Lanzhou University (LZU)**, 09/2018 - 07/2022
* **GPA:** 92.8/100, **Ranking:** 1/192
* **Honors:**
* China National Scholarship (Top 0.1% Nationwide)
* Merit Student in Gansu Province (Top 1% Province-wide)
* **M.Comp. in Artificial Intelligence, National University of Singapore (NUS)**, 08/2022 - Present
* **GPA:** 4.42/5, **Supervisor:** Prof. [Yang You](https://www.comp.nus.edu.sg/~youy/) (Director and PI of HPC-AI Lab)

Work experience
Research Experience
======
* Summer 2015: Research Assistant
* Github University
* Duties included: Tagging issues
* Supervisor: Professor Git

* Fall 2015: Research Assistant
* Github University
* Duties included: Merging pull requests
* Supervisor: Professor Hub

Skills
* **Master's Dissertation at HPC-AI Lab, National University of Singapore**, 05/2023 – Present
* Working with [Fuzhao Xue](https://xuefuzhao.github.io) as second author on OpenMoE, the **first open-source, decoder-only MoE language model**. We released the code and checkpoint, which received ~**750 stars** on [GitHub](https://github.com/XueFuzhao/OpenMoE) and ~**500 likes** on [Twitter](https://twitter.com/xuefz/status/1693696988611739947?s=61&t=shUN33SHHFV3CuEuz26WcA).
* Investigated publicly available pre-training corpora (English, Chinese, multilingual, code, etc.), preprocessing methods, and tokenization techniques. Ran experiments comparing tokenizers. Prepared the pre-training, SFT, and evaluation datasets in TFDS format.
* Participated in the PyTorch implementation of OpenMoE. Currently conducting a literature review of Mixture-of-Experts models and writing the paper.


Work Experience
======
* Skill 1
* Skill 2
* Sub-skill 2.1
* Sub-skill 2.2
* Sub-skill 2.3
* Skill 3

* **Artificial Intelligence Engineer Intern, Beijing, HPC-AI Tech**, 07/2023 – 11/2023

**Keywords:** Data-Centric methods, Long Context LLM, Retrieval Augmented Generation

* Extended LLaMA’s vocabulary for Chinese text and participated in the data cleaning and preparation process in the [Colossal-LLaMA-2 project](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base) (**186,000 downloads** on Hugging Face so far).
* Context length extrapolation: Investigated common context extrapolation techniques (e.g. PI, NTK, LongLLaMA, LongLoRA, etc.), long-text training corpora, and long-text evaluation methods. Working on constructing Chinese long-text training data and running multi-GPU training to extend the context length of Colossal-LLaMA-2.
* Participated in the [ColossalQA](https://github.com/hpcaitech/ColossalAI/tree/main/applications/ColossalQA) project, a retrieval-based QA framework built on LangChain.
* Contributed to the writing of the book 'Practical Large AI Models', edited by Professor Yang You.

Publications
======
<ul>{% for post in site.publications %}
{% include archive-single-cv.html %}
{% endfor %}</ul>

Talks
======
<ul>{% for post in site.talks %}
{% include archive-single-talk-cv.html %}
{% endfor %}</ul>

Teaching
======
<ul>{% for post in site.teaching %}
{% include archive-single-cv.html %}
{% endfor %}</ul>

Service and leadership
======
* Currently signed in to 43 different slack teams
* OpenMoE: Open Mixture-of-Experts Language Models [[Code]](https://github.com/XueFuzhao/OpenMoE) [[Blog]](https://www.notion.so/Aug-2023-OpenMoE-v0-2-Release-43808efc0f5845caa788f2db52021879) [[Twitter]](https://x.com/XueFz/status/1693696988611739947?s=20) \
Fuzhao Xue, **Zian Zheng**, Yao Fu, Jinjie Ni, Zangwei Zheng, Wangchunshu Zhou and Yang You
***GitHub repository***
18 changes: 18 additions & 0 deletions files/academic_family_tree.md
@@ -0,0 +1,18 @@
- Isaac Newton, M.A. University of Cambridge 1668
- Roger Cotes, M.A. University of Cambridge 1706
- Robert Smith, M.A. University of Cambridge 1715
- Walter Taylor, M.A. University of Cambridge 1723
- Stephen Whisson, M.A. University of Cambridge 1742
- Thomas Postlethwaite, M.A. University of Cambridge 1756
- Thomas Jones, M.A. University of Cambridge 1782
- Adam Sedgwick, M.A. University of Cambridge 1811
- William Hopkins, M.A. University of Cambridge 1830
- Arthur Cayley, Ph.D. University of Oxford 1875
- Andrew Forsyth, Ph.D. University of Cambridge 1881
- Edmund Whittaker, M.A. University of Cambridge 1895
- John Synge, D.Sc. Trinity College, Dublin 1926
- Byron Griffith, Ph.D. University of Toronto 1937
- William Kahan, Ph.D. University of Toronto 1958
- James Demmel, Ph.D. University of California, Berkeley 1983
- Yang You, Ph.D. University of California, Berkeley 2020
- Zian Zheng, M.Comp. National University of Singapore 2024 (Expected)
File renamed without changes
Binary file added images/poem.png
Binary file added images/research_interests.png
