In recent years, technology has been subdivided into ever more fields, each of which has witnessed a proliferation of remarkable achievements. However, limited time and resources often allow individuals to focus on only a few domains, or even on specific branches within a particular field.
To ease the impact of rising barriers in various domains on beginners, and to enable them to quickly experiment and set up application environments, we have developed the SJTU-TES (Shanghai Jiao Tong University Technology Engagement Square) platform.
Through this platform, users can gain insights into cutting-edge research across different domains and conveniently establish development or experimental environments using our interactive space, reproducible repositories, and testing datasets.
🔥 We mark work contributed by SJTU-TES with ⭐.
🔥 We have provided a demonstration video of the sjtu_tes space here.
🔥 We provide Chinese Requirement, Design, Testing, and Deployment Documents.
🔥 We primarily use the following icons to indicate the organization of each repository.
The corresponding published paper of the work, where "xxxx" refers to the name of the conference or journal in which it was published, and "arXiv" denotes the preprint version.
The corresponding GitHub link of the work.
The storage location of the pre-trained files for this repository (usually hosted on Hugging Face or Google Drive).
The webpage address for this work.
The dataset included with the work itself, as well as the datasets provided by the SJTU-TES team that are relevant to this work.
The reproduction of certain CPU-based work using the free space service provided by Hugging Face. You can visit the corresponding space to experience some practical applications of this work.
Some repositories can only be run on GPUs (taking several hours or even days on CPUs), which makes the free space service provided by Hugging Face impractical for them. We therefore provide reproducible repositories (including instructive README.md files) to address this limitation.
Stable Diffusion
, a latent text-to-image diffusion model capable of generating photo-realistic images given any text input. stable-diffusion-v1-4 is resumed from stable-diffusion-v1-2: 225,000 steps at resolution 512x512 on laion-aesthetics v2 5+ with 10% dropping of the text-conditioning to improve classifier-free guidance sampling.
The stable-diffusion-v1-5 checkpoint was initialized with the weights of the stable-diffusion-v1-2 checkpoint and subsequently fine-tuned for 595k steps at resolution 512x512 on laion-aesthetics v2 5+ with 10% dropping of the text-conditioning to improve classifier-free guidance sampling.
Click to view examples we have implemented
- Scarlett, nature, (((beauty))), (((smooth))),white,Highest quality
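For readers who want to reproduce this locally, here is a minimal sketch using the Hugging Face diffusers library; the hub id, fp16 dtype, and CUDA device are assumptions about your setup, not part of the original work.

```python
import torch
from diffusers import StableDiffusionPipeline

# load the v1-5 checkpoint from the Hugging Face Hub (hub id may vary by mirror)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "Scarlett, nature, (((beauty))), (((smooth))), white, Highest quality"
image = pipe(prompt).images[0]  # one 512x512 image by default
image.save("scarlett.png")
```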
Latte
, a novel latent diffusion transformer for video generation, utilizes spatio-temporal tokens extracted from input videos and employs a series of Transformer blocks to model the distribution of videos in the latent space. Latte achieves state-of-the-art performance on four standard video generation datasets: FaceForensics, SkyTimelapse, UCF101, and Taichi-HD.
Click to view examples we have implemented
- Yellow and black tropical fish dart through the sea.
- An epic tornado attacking above a glowing city at night.
- Slow pan upward of blazing oak fire in an indoor fireplace.
- A cat wearing sunglasses and working as a lifeguard at pool.
- Sunset over the sea.
- A dog in astronaut suit and sunglasses floating in space.
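To make the notion of spatio-temporal tokens concrete, here is a minimal PyTorch sketch of patchifying a latent video into tokens; the tensor shapes and patch size are illustrative assumptions, not Latte's exact configuration.

```python
import torch

def video_to_tokens(latents: torch.Tensor, p: int = 2) -> torch.Tensor:
    """Split a latent video (batch, frames, channels, height, width)
    into one token per non-overlapping p x p spatio-temporal patch."""
    b, f, c, h, w = latents.shape
    x = latents.reshape(b, f, c, h // p, p, w // p, p)
    x = x.permute(0, 1, 3, 5, 2, 4, 6)  # (b, f, h/p, w/p, c, p, p)
    return x.reshape(b, f * (h // p) * (w // p), c * p * p)

latents = torch.randn(1, 16, 4, 32, 32)  # toy clip: 16 latent frames
print(video_to_tokens(latents).shape)    # torch.Size([1, 4096, 16])
```

Transformer blocks then attend over this token sequence to model the video distribution in latent space.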
BLIP-2
, Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. BLIP-2 beats Flamingo on zero-shot VQAv2 (65.0 vs 56.3) and establishes a new state of the art in zero-shot captioning (121.6 CIDEr on NoCaps vs the previous best of 113.2). Equipped with powerful LLMs (e.g., OPT, FlanT5), BLIP-2 also unlocks new zero-shot instructed vision-to-language generation capabilities for various interesting applications!
Click to view examples we have implemented
- "Question: what is the main elements in the picture? "
- "Answer: the eiffel tower"
Stable Diffusion v2
, high-resolution image synthesis with latent diffusion models. The stable-diffusion-2 model is resumed from stable-diffusion-2-base (512-base-ema.ckpt) and trained for 150k steps using a v-objective on the same dataset.
Click to view examples we have implemented
- ((two)) ((dogs)) in the picture, ((nature)), (((beauty))), (((smooth))),white,Highest quality
FaceSwap
, a tool that utilizes deep learning to recognize and swap faces in pictures and videos. FaceSwap supports various operating systems (Windows, Linux, macOS) and offers powerful face-swapping capabilities, performing best on a modern GPU with CUDA support. With FaceSwap, users can gather photos and videos, extract faces from them, train a model based on the extracted faces, and then seamlessly swap faces in their sources using the trained model.
Roop
, a tool that takes a video and replaces the face in it with a face of the user's choice. Users only need one image of the desired face; no dataset, no training.
UniversalFakeDetect
, proposes to perform real-vs-fake classification without learning; i.e., using a feature space not explicitly trained to distinguish real from fake images. The authors use nearest neighbor and linear probing as instantiations of this idea. When given access to the feature space of a large pretrained vision-language model, the very simple baseline of nearest neighbor classification has surprisingly good generalization ability in detecting fake images from a wide variety of generative models.
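The nearest-neighbor baseline is simple enough to sketch in a few lines; the feature banks below are hypothetical stand-ins for features from a frozen pretrained encoder such as CLIP.

```python
import torch
import torch.nn.functional as F

def nn_is_fake(query, bank_real, bank_fake):
    """Flag an image as fake if its feature is closer (in cosine similarity)
    to the bank of known-fake features than to the bank of known-real ones."""
    query = F.normalize(query, dim=0)
    return (bank_fake @ query).max() > (bank_real @ query).max()

# toy example with random 512-d unit features standing in for CLIP features
bank_real = F.normalize(torch.randn(100, 512), dim=1)
bank_fake = F.normalize(torch.randn(100, 512), dim=1)
print(nn_is_fake(torch.randn(512), bank_real, bank_fake))
```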
pygmtools
, Python Graph Matching Tools, provides graph matching solvers in Python. To make researchers' lives easier, pygmtools supports various solvers (linear, quadratic, multi-graph, neural) and various backends (numpy, pytorch, jittor, paddle, tensorflow, mindspore). pygmtools is also deep-learning friendly: its operations are designed to best preserve gradients during computation, and batched operations are supported for the best performance.
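As a taste of the API, here is a minimal linear-assignment example; the similarity matrix is a toy input.

```python
import numpy as np
import pygmtools as pygm

pygm.set_backend('numpy')  # numpy is the default; older releases used pygm.BACKEND = 'numpy'

# toy 3x3 node-similarity matrix between two graphs
sim = np.array([[0.9, 0.1, 0.0],
                [0.2, 0.8, 0.1],
                [0.1, 0.0, 0.7]])

# Hungarian solver for the linear assignment problem; returns a 0/1 matching matrix
print(pygm.hungarian(sim))
```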
GENN-A*
, Graph Edit Neural Network (GENN), aims to accelerate the A* solver for the graph edit distance problem using a graph neural network. The GENN-aided A* algorithm replaces the heuristic prediction module of A* with a GNN; since the accuracy of the heuristic prediction is crucial for the performance of A*, this approach significantly improves its efficiency.
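To show where the GNN plugs in, here is a minimal tree-search A* sketch with a pluggable heuristic; `expand`, `is_goal`, and `heuristic` are hypothetical callbacks (in GENN-A*, `heuristic` would be the GNN predictor replacing a hand-crafted lower bound).

```python
import heapq

def a_star(start, is_goal, expand, heuristic):
    """Tree-search A*: expand(state) yields (next_state, step_cost) pairs;
    heuristic(state) estimates the remaining cost (e.g. a GNN prediction)."""
    frontier = [(heuristic(start), 0.0, 0, start)]  # (f = g + h, g, tiebreak, state)
    tie = 1
    while frontier:
        _, g, _, state = heapq.heappop(frontier)
        if is_goal(state):
            return g  # cost of the cheapest edit path found
        for nxt, cost in expand(state):
            heapq.heappush(frontier, (g + cost + heuristic(nxt), g + cost, tie, nxt))
            tie += 1
    return None
```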
T2T
, Training to Testing. The T2TCO framework first leverages generative modeling to estimate a high-quality solution distribution for each instance during training, and then conducts a gradient-based search within the solution space during testing.
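A minimal sketch of the two-stage idea; `model` (the trained generative model) and `objective` (a differentiable relaxation of the instance objective) are hypothetical stand-ins.

```python
import torch

def test_time_search(model, instance, objective, steps=50, lr=0.1):
    """Sample a solution distribution from the trained model, then refine it
    by gradient descent on a relaxed objective at test time."""
    probs = model(instance)                    # learned per-variable probabilities
    x = torch.logit(probs.clamp(1e-4, 1 - 1e-4)).clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):                     # gradient-based search
        loss = objective(torch.sigmoid(x), instance)
        opt.zero_grad(); loss.backward(); opt.step()
    return (torch.sigmoid(x) > 0.5).float()    # round to a discrete solution
```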
GNetChat
, General Networking Chat Website designed by the SJTUGN Group, where students can easily form study groups, create posts, make friends, share essential resources, and collaborate on projects in real time.
VidFetch
, an open-source dataset download tool to obtain copyright-free videos from various free video websites.
web-cpp
, an online platform that enables users to write and execute C++ code directly within their browsers.
Transmomo
, Invariance-Driven Unsupervised Video Motion Retargeting, a lightweight video motion retargeting approach capable of transferring motion in spite of structural and view-angle disparities between the source and the target.
EverybodyDanceNow
, a simple method for "do as I do" motion transfer: given a source video of a person dancing, the performance can be transferred to a novel (amateur) target after only a few minutes of the target subject performing standard moves.
Openpose
, a real-time multi-person keypoint detection library for 2D pose estimation. We provide a PyTorch implementation of OpenPose, including body and hand pose estimation.
RVM
, Robust High-Resolution Video Matting with Temporal Guidance. RVM is specifically designed for robust human video matting. Unlike existing neural models that process frames as independent images, RVM uses a recurrent neural network to process videos with temporal memory.
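RVM's recurrent inference loop is short enough to show here; this sketch follows the usage documented in the RobustVideoMatting repository, with a toy random clip standing in for real video frames.

```python
import torch

# load RVM via torch.hub, as documented in the repository
model = torch.hub.load("PeterL1n/RobustVideoMatting", "mobilenetv3").eval()

frames = torch.rand(10, 1, 3, 288, 512)  # toy clip: 10 RGB frames in [0, 1]
rec = [None] * 4  # four recurrent states carry temporal memory across frames
with torch.no_grad():
    for src in frames:  # src: (B, C, H, W)
        fgr, pha, *rec = model(src, *rec, downsample_ratio=0.25)
        # fgr: predicted foreground, pha: alpha matte for this frame
```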
DLSec
, a deep learning model security evaluation platform. Taking attack and defense paradigms such as adversarial examples, data poisoning, and backdoor attacks as examples, it studies and implements mainstream offensive and defensive algorithms for deep learning models, and builds a comprehensive, effective evaluation system for them from both white-box and black-box perspectives.
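As one concrete instance of the attacks such a platform evaluates, here is a minimal FGSM (fast gradient sign method) sketch; `model` and `loss_fn` are hypothetical placeholders for the model under test and its loss.

```python
import torch

def fgsm(model, loss_fn, x, y, eps=8 / 255):
    """One-step FGSM: perturb x along the sign of the input gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    # single signed-gradient step, clamped back to the valid pixel range
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```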
WDAD
, Adversarial sample detection based on weak dark textures
UAP
, Fingerprinting Deep Neural Networks Globally via Universal Adversarial Perturbations, a novel and practical mechanism that enables the service provider to verify whether a suspect model was stolen from the victim model via model extraction attacks.
WAV2COM
, Your Microphone Array Retains Your Identity: A Robust Voice Liveness Detection System for Smart Speakers
Sandbox
, helps you compile and run your C++ code within an isolated Docker container. Using Docker ensures that your code runs consistently and predictably in any Docker-enabled environment, making it convenient to develop and test your C++ project across different systems.