
TIL

  • Records are the foundation of everything. Keep a steady log of what you see, learn, and try.

Principles

  • Try to update every day, even if only a little. Consistency is what matters.
  • Write clearly, so nothing is confusing when read again later.
  • Don't slack on code practice just because you're reading theory!

List to organize : Last updated 2022/01/11

(2021) StyTr^2: Unbiased Image Style Transfer with Transformers  
(2021) GPEN : GAN Prior Embedded Network for Blind Face Restoration in the Wild
(2020) NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis  
  
(2020) Neural Head Reenactment with Latent Pose Descriptors
(2020) First Order Motion Model for Image Animation
(2020) One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing 
(2020) DataAugmentation : Fair Attribute Classification through Latent Space De-biasing
  
Face swap/reenactment
(2018) RSGAN: Face Swapping and Editing using Face and Hair Representation in Latent Spaces
(2018) ReenactGAN: Learning to Reenact Faces via Boundary Transfer
(2016) Face2Face: Real-time Face Capture and Reenactment of RGB Videos

(2016) Loss Functions for Image Restoration with Neural Networks ; L1 vs L2 vs SSIM family

Neural Rendering

(2020) State of the Art on Neural Rendering

3DMM

(2020) StyleRig : Rigging StyleGAN for 3D Control over Portrait Images
(1999) A Morphable Model For The Synthesis Of 3D Faces

Anomaly Detection

(2019) OCGAN: One-class Novelty Detection Using GANs with Constrained Latent Representations
(2018) DeepAnT: A Deep Learning Approach for Unsupervised Anomaly Detection in Time Series
(2018) GANomaly : Semi-Supervised Anomaly Detection via Adversarial Training
(2017) AnoGAN : Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery  

Battery

(2019) Data-driven health estimation and lifetime prediction of lithium-ion batteries: A review

CAM

(2020) Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias
(2016) Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
(2015) CAM : Learning Deep Features for Discriminative Localization

Classification

(2020) How Much Position Information Do Convolutional Neural Networks Encode?
(2018) ArcFace: Additive Angular Margin Loss for Deep Face Recognition

Colorization

(2017) RTUG : Real-Time User-Guided Image Colorization with Learned Deep Priors

Data Augmentation

(2021) StyleMix : Separating Content and Style for Enhanced Data Augmentation

FaceSwap

(2021) HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping
(2021) SimSwap: An Efficient Framework For High Fidelity Face Swapping
(2020) DeepFaceLab: Integrated, flexible and extensible face-swapping framework
(2019) FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping
(2019) FSGAN: Subject Agnostic Face Swapping and Reenactment

(2017) On Face Segmentation, Face Swapping, and Face Perception

Generative Model

(2014) Generative Adversarial Networks  

I2I translation

(2021) Not just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction
(2020) AttentionGAN: Unpaired Image-to-Image Translation using Attention-Guided Generative Adversarial Networks
(2020) GANHopper : Multi-Hop GAN for Unsupervised Image-to-Image Translation
(2019) U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for I2I Translation
(2019) StarGAN v2: Diverse Image Synthesis for Multiple Domains
(2019) AMGAN : Attribute Manipulation Generative Adversarial Networks for Fashion Images
(2018) StarGAN : Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
(2018) Ganimorph : Improving Shape Deformation in Unsupervised I2I Translation
(2017) CycleGAN : Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

Image Synthesis

(2021) StyleGAN v3 : Alias-Free Generative Adversarial Networks
(2021) StyleMapGAN : Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing : TBD
(2020) A U-Net Based Discriminator for Generative Adversarial Networks
(2020) StyleGAN v2 : Analyzing and Improving the Image Quality of StyleGAN
(2019) StyleGAN v1 : A Style-Based Generator Architecture for Generative Adversarial Networks
(2019) MSGAN : Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis
(2019) MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks 
(2018) PGGAN : Progressive Growing of GANs for Improved Quality, Stability, and Variation
(2016) Improved Techniques for Training GANs : TBD

LipSync

(2017) Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion : TBD

Normalization

(2016) Layer Normalization

3D Human Pose Estimation

(2022) MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video
(2021) Improving Robustness and Accuracy via Relative Information Encoding in 3D Human Pose Estimation
(2021) MoVNect : Lightweight 3D Human Pose Estimation Network Training Using Teacher-Student Learning
(2017) VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
(2006) Recovering 3D Human Pose from Monocular Images

Self Attention

(2018) CBAM: Convolutional Block Attention Module
(2018) BAM: Bottleneck Attention Module

Style Transfer

(2021) StyTr^2: Unbiased Image Style Transfer with Transformers
(2017) Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
(2016) Image Style Transfer Using Convolutional Neural Networks
(2001) Image Analogies

Time Series

(2019) TimeGAN : Time-series Generative Adversarial Networks
(2017) RCGAN : Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs

WSSS

(2020) Unsupervised Learning of Image Segmentation Based on Differentiable Feature Clustering

Attention

(2019) Stand-Alone Self-Attention in Vision Models

Object Detection

(2021) Dynamic Head: Unifying Object Detection Heads with Attentions

Dataset

(2017) VGGFace2: A dataset for recognising faces across pose and age

Animation

(2020) First Order Motion Model for Image Animation

Novel View Synthesis

(2020) NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

2. Other Theory and Models

Graphical Model : In models with state, the concepts of directed/undirected graphical models come up frequently.
Restricted Boltzmann Machine : A method proposed by Geoffrey Hinton to address the problem of deep neural networks failing to train well. It resolves gradient vanishing through pre-training, which brought deep learning back to life. Understanding it is essential for understanding the generative family of models.
MCMC (Markov Chain Monte Carlo) : A sampling methodology (a minimal sketch follows below).
PyTorch Manual : The PyTorch usage manual.
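
As a standalone illustration (not tied to any paper above), here is a minimal Metropolis-Hastings sampler, the simplest MCMC variant, targeting a 1-D standard normal; the proposal width of 0.5 is an arbitrary choice.

```python
import random, math

def metropolis_hastings(log_prob, x0, n_samples, step=0.5):
    """Minimal Metropolis-Hastings with a Gaussian random-walk proposal."""
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + random.gauss(0.0, step)
        # Accept with probability min(1, p(proposal) / p(x))
        if math.log(random.random() + 1e-300) < log_prob(proposal) - log_prob(x):
            x = proposal
        samples.append(x)
    return samples

# Target: standard normal density, known only up to a constant
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=10_000)
print(sum(samples) / len(samples))  # sample mean should be near 0
```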

Textbook

[Machine Learning : A Probabilistic Perspective][t_link001] : The book I consider the bible of ML. I've long thought I should read it and write it up someday, but who knows when I'll get through it all...

3. Coursera


["Miramax","Focus Features","TriStar Pictures","Orion Pictures","Revolution Studios","Relativity Media","Anchor Bay Entertainment","The Weinstein Company","Roadside Attractions","Alcon Entertainment","Millennium Films","Imagine Entertainment","Scott Free Productions","Participant Media","Village Roadshow Pictures","Castle Rock Entertainment","Spyglass Media Group","Regency Enterprises","Plan B Entertainment","Happy Madison Productions","Blumhouse Television","The Jim Henson Company","Pure Flix Entertainment","Hallmark Channel","ShadowMachine","Skydance Media","Pinewood Studios","Hammer Films","Ealing Studios","Carnaby International","Revolution Films","Kudos Film and Television","Tiger Aspect Productions","Hat Trick Productions","Warp Films","Vertigo Films","Baby Cow Productions","Channel 4 Productions","BFI (British Film Institute)","Mammoth Screen","Big Talk Productions","Nippon Animation","OLM, Inc.","J.C. Staff","Pierrot","Studio Ghibli","Gonzo","Studio DEEN","Manglobe","Tatsunoko Production","Gainax","LIDENFILMS","Brain’s Base","Studio Khara","Silver Link","Diomedéa","카카오엔터테인먼트","웨이브 오리지널 콘텐츠 (Wavve)","쿠팡플레이 (Coupang Play)","오파스픽쳐스","팬엔터테인먼트","글로빅엔터테인먼트","리틀빅픽쳐스","판타지오","스토리웨이","스튜디오 선데이","더블유픽쳐스","앤드마크","롯데컬처웍스","마당엔터테인먼트","Baidu Video","Le Vision Pictures","DMG Entertainment","Pearl Studio","Beijing Hairun Pictures","Light Chaser Animation Studios","TF1 Films Production","BAC Films","Laika Films","Haut et Court","Memento Films","SND Films","Neue Constantin Film","Studio Babelsberg","Pandora Film","X Filme Creative Pool","Corus Entertainment","Nelvana","Shaftesbury Films","Alliance Films","Bell Media Studios","Madman Entertainment","Matchbox Pictures","Southern Star Entertainment","Titanus","Lux Vide","Palomar","Wildside","Tornasol Films","Bambú Producciones","La Zona Films","Film i Väst","Moviola Film och Television","Paradox Film","Helsinki-filmi","Warner Bros Studios","Universal Studios","Disney Studios","Paramount Pictures","Sony Pictures","MGM Studios","20th Century Studios","DreamWorks","Lionsgate","A24","Amazon Studios","Apple Studios","Netflix Productions","Hulu","HBO","CBS Studios","NBC Universal","AMC Studios","Blumhouse Productions","New Line Cinema","Amblin Entertainment","Bad Robot","Lakeshore Entertainment","Legendary Pictures","Voltage Pictures","STX Entertainment","BBC Studios ","ITV Studios","Working Title Films","Aardman Animations","Film4","Pathé UK","Sky Studios","Left Bank Pictures","Red Production Company","Raw TV","Toho","Toei","TMS Entertainment","Sunrise","Aniplex","Madhouse","MAPPA","Bones","Production I.G","Kyoto Animation","TV Tokyo","NHK Enterprises","TV Asahi","Fuji TV","Dentsu","A-1 Pictures","CloverWorks","WIT Studio","Trigger","ufotable","Studio Dragon","CJ ENM","JTBC Studios","키이스트","NEW","쇼박스","에이스토리","빅펀치픽쳐스","초록뱀미디어","화이브라더스","SLL (Studio LuluLala)","스튜디오N","스튜디오& NEW","영화사 문","미디어플렉스","필름모멘텀","영화사 월광","메가박스중앙플러스엠","영화사 금월","콘텐츠케이","무스프로덕션","더그루브컴퍼니","사나이픽처스","글앤그림미디어","JS Pictures","글라인","스튜디오329","크리에이티브그룹 잉그","스튜디오테이크원","바이포엠스튜디오","모호필름","히든시퀀스","블라인드","제이콘텐트리","마운틴무브먼트","비욘드제이","스튜디오플렉스","키사필름","영화사 올","영화사 조이","필름케이","영화사 호필름","스튜디오앤뉴","영화사 이스트드림","베리굿스튜디오","스튜디오피닉스","피플스토리컴퍼니","마인드마크","영화사 청어람","고고스튜디오","Huayi Brothers","Bona Film Group","Wanda Pictures","Beijing Enlight Media","Alibaba Pictures","Tencent Pictures","China Film Group","Shanghai Film Group","Youku","iQiyi","Bilibili","Mango TV","Perfect World Pictures","Gaumont","Pathé","StudioCanal","EuropaCorp","Wild Bunch","Why Not Productions","MK2","Les Films du Losange","Arte 
France","Constantin Film","Bavaria Film","Studio Hamburg","UFA GmbH","Beta Film","ZDF Enterprises","Entertainment One","Lionsgate Television","Cineflix","9 Story Media Group","DHX Media","Thunderbird Entertainment","Breakthrough Entertainment","Village Roadshow","Screen Australia","Animal Logic","See-Saw Films","Porchlight Films","Rai Cinema","Medusa Film","Fandango","Lucky Red","Indiana Production","Atresmedia Studios","Mediapro","Filmax","Morena Films","Telefonica Studios","Nordisk Film","SF Studios","Yellow Bird","Zentropa","Nimbus Film"]

I have decided to give a "Yes" for this candidate. The candidate has extensive experience in the vision field and demonstrates a deep understanding of the overall process and of projects such as model optimization. Candidates with significant optimization experience tend to be familiar with low-level coding, and this candidate showed solid coding proficiency despite not having practiced extensively for coding tests. Although it took some time, the candidate solved the coding test problems effectively, providing solutions that were optimal in both space and time complexity. Their explanations were clear, and there were no communication issues.


Regarding Episode Detection, I have been working on training an OCR model. When creating the dataset, I initially combined word- and sentence-level data, but the recognition model's performance was relatively low: even with sufficient training, accuracy stayed below 85%. I'm therefore retraining the model on word-level data, following conventional OCR training practice (a sketch of the word-level filtering step follows below). I aim to complete this training this week and then build a pipeline around the trained models.
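
A minimal sketch of that word-level filtering step, assuming a labels.tsv file in an image-path / TAB / text layout; the file names and format are hypothetical, not the actual dataset layout.

```python
# Hypothetical sketch: keep only single-word samples from a combined
# word/sentence-level label file (assumed format: "image_path\ttext").
def filter_word_level(in_path="labels.tsv", out_path="labels_words.tsv"):
    kept = 0
    with open(in_path, encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            image_path, _, text = line.rstrip("\n").partition("\t")
            if text and len(text.split()) == 1:  # drop sentence-level samples
                dst.write(f"{image_path}\t{text}\n")
                kept += 1
    return kept

print(filter_word_level(), "word-level samples kept")
```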

I have completed training the text classification model that categorizes the detected text. I defined ten classes: Episode Number, Previous Story, Rating, Next Story, Time Notice, Subtitle, Title, Description, Human Name, and Content Provider. The model classifies text inputs and was trained on synthetic data. I used a SimCSE-based sentence embedder, feeding its embedding vectors into a classifier I designed. Validation accuracy on the synthetic data reached 97%, and quality was also good when testing on actual use cases. Through hyperparameter optimization with Optuna, I selected a model with optimal parameters, which will be used in the pipeline (a sketch of this setup follows below).
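
A hedged sketch of that setup, assuming a frozen sentence embedder whose vectors feed a small PyTorch classifier tuned with Optuna; the 768-dim embedding size, search space, and random placeholder tensors are illustrative assumptions, not the actual configuration.

```python
import optuna
import torch
import torch.nn as nn

NUM_CLASSES = 10   # Episode Number, Previous Story, Rating, ...
EMB_DIM = 768      # assumed output size of the frozen sentence embedder

# Placeholders standing in for embeddings of the synthetic training texts.
X_train = torch.randn(2000, EMB_DIM)
y_train = torch.randint(0, NUM_CLASSES, (2000,))
X_val = torch.randn(500, EMB_DIM)
y_val = torch.randint(0, NUM_CLASSES, (500,))

def build_model(hidden):
    # Small MLP head on top of the frozen embeddings
    return nn.Sequential(
        nn.Linear(EMB_DIM, hidden), nn.ReLU(), nn.Linear(hidden, NUM_CLASSES)
    )

def objective(trial):
    hidden = trial.suggest_int("hidden", 64, 512, log=True)
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    model = build_model(hidden)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(20):  # short full-batch training budget per trial
        opt.zero_grad()
        loss_fn(model(X_train), y_train).backward()
        opt.step()
    with torch.no_grad():
        return (model(X_val).argmax(1) == y_val).float().mean().item()

study = optuna.create_study(direction="maximize")  # maximize val accuracy
study.optimize(objective, n_trials=30)
print(study.best_params)
```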

For Auto Matching, which links the latest files in the ingestion out bucket with CMS data, I started this work this week, having made good progress on Episode Detection. The first step was retrieving the file list from the ingestion out bucket. I found that an inventory system stores the file lists, so I'm using its CSV file to filter the list and link the latest files with CMS data (a minimal sketch follows below). I will continue this work through further discussions with Tyler.
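
A minimal pandas sketch of that latest-file selection, assuming the inventory CSV has key and last_modified columns and that the content ID is the first path segment of the key; the CMS join column is likewise an assumption for illustration.

```python
import pandas as pd

# Hypothetical inventory CSV with columns: key, last_modified
inv = pd.read_csv("inventory.csv", parse_dates=["last_modified"])

# Assumed rule: the content ID is the first path segment of the key,
# e.g. "EP0001/video.mxf" -> "EP0001".
inv["content_id"] = inv["key"].str.split("/").str[0]

# Keep only the most recent object per content ID ...
latest = inv.sort_values("last_modified").groupby("content_id").tail(1)

# ... then join against CMS records on the same ID (CMS schema assumed).
cms = pd.read_csv("cms_export.csv")  # assumed to contain a content_id column
matched = latest.merge(cms, on="content_id", how="inner")
print(matched[["content_id", "key", "last_modified"]].head())
```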
