NBTI: NN-Based Typography Incorporating Semantics

CS470 Introduction to Artificial Intelligence TEAM P12
NN-Based Typography Incorporating semantics.pdf

Team Member

| Name | Student ID | GitHub |
| --- | --- | --- |
| Doojin Baek | 20190289 | DoojinBaek |
| Min Kim | 20200072 | minggg012 |
| Dongwoo Moon | 20200220 | snaoyam |
| Dongjae Lee | 20200445 | duncan020313 |
| Hanbee Jang | 20200552 | janghanbee |

Reference Paper

Word-As-Image for Semantic Typography (SIGGRAPH 2023)

Abstract

We propose NBTI, an NN-based typography model that visually represents letters so that they reflect the meaning of both concrete and formless words. Our focus was on overcoming the limitations of the previous paper, "Word-As-Image," and on presenting future directions. In the previous paper, excessive deformation could make characters unreadable, so the degree of geometric deformation was measured to prevent this. However, that approach limited the expressive capability of the characters. We shifted the focus to the readability of the characters: instead of simply comparing geometric values, we employed a vision model that compares encoded vectors to evaluate how well the characters are recognized, a metric we call "Embedding Loss." Furthermore, the previous model had difficulty visualizing shapeless words. To address this, we introduced a preprocessing step that uses LLM fine-tuning to transform shapeless words into words with concrete forms. We named the module responsible for this transformation the "Concretizer." It transforms abstract, shapeless words such as "Sweet" and "Idea" into words with clear forms such as "Candy" and "Lightbulb." We used the GPT-3.5 model, specifically the text-davinci-003 variant, and fine-tuned it on 427 data samples.
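To make the "Embedding Loss" idea more concrete, here is a minimal sketch in plain Python. It assumes the loss is a cosine distance between the encoder outputs for the original and the deformed glyph; the abstract does not state which distance is used, so that choice, the function names, and the toy encoder are all hypothetical stand-ins rather than the project's implementation.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)


def embedding_loss(encode, original_glyph, deformed_glyph):
    """Hypothetical readability penalty: distance between encoder outputs.

    `encode` is any callable mapping a glyph image to a vector. In the real
    system this would be a trained vision encoder (assumption).
    """
    return cosine_distance(encode(original_glyph), encode(deformed_glyph))


def toy_encode(glyph):
    """Toy stand-in encoder (assumption): flatten a 2D pixel grid."""
    return [float(p) for row in glyph for p in row]
```

A loss near 0 means the encoder sees the deformed letter as close to the original, while larger values indicate that the letter has drifted away from a recognizable shape.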

Model Structure

Dataset

Letter classifier dataset

curl http://143.248.235.11:5000/fontsdataset/dataset.zip -o ./data.zip

LLM finetuning dataset

./finetuning/finetuning.jsonl
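
As a rough illustration of how such a file could be inspected, the sketch below parses JSONL text into (abstract word, concrete word) pairs. It assumes each line is a JSON object with `prompt` and `completion` fields, which is the layout used by the legacy OpenAI fine-tuning format; the actual keys in `finetuning.jsonl` may differ, and the sample records are built from the examples in the abstract, not copied from the dataset.

```python
import json


def load_pairs(jsonl_text):
    """Parse JSONL text into (prompt, completion) tuples, skipping blank lines."""
    pairs = []
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumed field names; adjust if the actual file uses different keys.
        try:
            pairs.append((record["prompt"], record["completion"]))
        except KeyError as exc:
            raise ValueError(f"line {line_no}: missing field {exc}") from exc
    return pairs


# Illustrative records only (assumption), mirroring the abstract's examples.
sample = (
    '{"prompt": "Sweet", "completion": "Candy"}\n'
    '{"prompt": "Idea", "completion": "Lightbulb"}\n'
)

for abstract_word, concrete_word in load_pairs(sample):
    print(abstract_word, "->", concrete_word)
```

To run it on the real file, pass the file's contents, for example `load_pairs(open("finetuning/finetuning.jsonl", encoding="utf-8").read())`.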

Setup

  1. Clone the GitHub repo:
git clone https://github.com/DoojinBaek/CS470_NBTI
cd CS470_NBTI
  2. Create a new conda environment and install the libraries:
conda env create -f word_env.yaml
conda activate word
  3. Install diffusers:
pip install diffusers==0.8
pip install transformers scipy ftfy accelerate
  4. Install diffvg:
git clone https://github.com/BachiLi/diffvg.git
cd diffvg
git submodule update --init --recursive
python setup.py install
  5. Execute the setup bash file:
bash setup.sh

Run Experiments

python code/main.py --experiment <experiment> --semantic_concept <concept> --optimized_letter <letter> --seed <seed> --font <font_name> --abstract <True/False> --gen_data <True/False> --use_wandb <0/1> --wandb_user <user name> 
  • --semantic_concept : the semantic concept to insert
  • --optimized_letter : one letter in the word to optimize
  • --font : font name, ttf file should be located in code/data/fonts/

Optional arguments:

  • --word : The text to work on, default: the semantic concept
  • --config : Path to config file, default: code/config/base.yaml
  • --experiment : You can specify any experiment in the config file, default: conformal_0.5_dist_pixel_100_kernel201
  • --log_dir : Default: output folder
  • --prompt_suffix : Default: "minimal flat 2d vector. lineal color. trending on artstation"
  • --abstract : Whether the input semantic concept is abstract (formless), default: False
  • --gen_data : Whether to generate the data needed for the initial training, default: False
  • --batch_size : Default: 1

Examples

  1. Formless word: Applying our encoder and concretizer
python code/main.py --semantic_concept "FANCY" --optimized_letter "Y" --font "KaushanScript-Regular" --abstract "TRUE"

  2. Concrete word: Applying our encoder only
python code/main.py --semantic_concept "CAT" --optimized_letter "C" --font "Moonies" --abstract "FALSE"
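
Because each run optimizes a single letter, it can be handy to generate one command per letter of a word. The helper below is a small sketch that only assembles the command lines from the flags documented above; `build_commands` is a hypothetical name, and skipping repeated letters is an assumption about how one might want to batch runs, not behavior taken from the repository.

```python
import shlex


def build_commands(word, font, abstract=False, seed=None):
    """Return one `python code/main.py` argument list per distinct letter."""
    commands = []
    seen = set()
    for letter in word:
        # Skip non-letters and repeated letters (assumption for this sketch).
        if not letter.isalpha() or letter in seen:
            continue
        seen.add(letter)
        cmd = [
            "python", "code/main.py",
            "--semantic_concept", word,
            "--optimized_letter", letter,
            "--font", font,
            "--abstract", "TRUE" if abstract else "FALSE",
        ]
        if seed is not None:
            cmd += ["--seed", str(seed)]
        commands.append(cmd)
    return commands


# Print shell-ready commands for every letter of "CAT".
for cmd in build_commands("CAT", "Moonies"):
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument lists can be passed directly to `subprocess.run` to launch the runs one after another.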