This project provides a REST API, built with FastAPI, that detects potential phishing URLs in SMS text messages. It combines an LLM (Google's Gemini), web scraping, and external APIs to estimate the likelihood that a URL is part of a smishing attempt.
- URL Extraction: Extracts URLs from SMS text messages.
- Phishing Detection: Integrates with the external criminalip.io API to determine if a URL is associated with malicious activity.
- Web Scraping: Uses Selenium and BeautifulSoup to analyze web page content for phishing indicators.
- LLM Integration: Leverages a Generative AI model (Google Gemini) to perform an in-depth analysis of web page source code, looking for phishing characteristics.
- Response Classification: Classifies URLs into categories like "phishing", "caution", "attention", or "safe" based on the analysis results.
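The URL extraction step can be illustrated with a minimal sketch. The pattern and the `extract_urls` helper below are hypothetical names chosen for this example, not the project's actual code; the sketch assumes that only `http://` and `https://` links need to be caught.

```python
import re
from typing import List

# Hypothetical pattern: an http(s) URL runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(sms_text: str) -> List[str]:
    """Return the URLs found in an SMS body, in order of appearance, without duplicates."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(sms_text):
        # Drop sentence punctuation that often trails a URL in running text.
        url = match.group(0).rstrip(".,;:!?)")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    print(extract_urls("Check out this link http://example.com."))
```

A regular expression keeps the example dependency-free; links without a scheme (for example `bit.ly/abc`) would need a broader pattern.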
- Python 3.8 or higher
- Poetry or `pip` for package management
- ChromeDriver for Selenium (ensure compatibility with your Chrome version)
- A `.env` file with the following environment variables:
  - `GEMINI_API_KEY`: Your API key for Google Gemini AI.
  - `CRIMINAL_API_KEY`: Your API key for the criminalip.io service.
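A start-up check for the two API keys might look like the sketch below. It assumes the variables from `.env` have already been exported into the process environment (how the project loads the file is not stated here), and `load_api_keys` is a hypothetical helper name.

```python
import os
from typing import Dict, Iterable, Mapping

# Variable names taken from the prerequisites list above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "CRIMINAL_API_KEY")


def load_api_keys(
    env: Mapping[str, str] = os.environ,
    required: Iterable[str] = REQUIRED_KEYS,
) -> Dict[str, str]:
    """Return the required API keys, or raise if any is missing or empty."""
    names = tuple(required)
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

Failing at start-up gives a clearer error than a rejected request to Gemini or criminalip.io later on.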
- Clone the repository:

      git clone https://github.com/your-username/phishing-detection-api.git
      cd phishing-detection-api

- Install dependencies. If you use Poetry:

      poetry install

  Or using pip:

      pip install -r requirements.txt

- Set up the `.env` file. Create a `.env` file in the project root directory and add your API keys:

      GEMINI_API_KEY=your_gemini_api_key
      CRIMINAL_API_KEY=your_criminal_api_key

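- Run the application:

      uvicorn main:app --reload

  The API will be accessible at http://localhost:8000. With this command uvicorn binds to 127.0.0.1 by default; add `--host 0.0.0.0` to listen on all interfaces.
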
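- Endpoint: `/check-sms`
- Method: `POST`
- Request Body: `{ "sms_text": "[Web발신] 암호화폐 상한가 확실한 정보 http://XXXXpay.support" }` (the sample message is Korean, roughly "[Sent from web] Reliable tip on a cryptocurrency hitting its upper price limit")
- Response: `{ "result": "phishing" | "caution" | "attention" | "safe" }`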
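On the client side, the `/check-sms` response can be checked against the four documented labels. The following is a minimal sketch; `parse_result` is a hypothetical helper, and it assumes the response body is the JSON object shown above.

```python
import json

# The four labels documented for the /check-sms response.
ALLOWED_RESULTS = ("phishing", "caution", "attention", "safe")


def parse_result(response_body: str) -> str:
    """Extract the "result" label from a /check-sms response body and validate it."""
    payload = json.loads(response_body)
    if not isinstance(payload, dict):
        raise ValueError("Expected a JSON object")
    result = payload.get("result")
    if result not in ALLOWED_RESULTS:
        raise ValueError("Unexpected result value: {!r}".format(result))
    return result
```

Rejecting unknown labels early keeps a client from silently treating an unexpected value as "safe".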
- Endpoint: `/llm`
- Method: `POST`
- Request Body: `{ "sms_text": "A single URL to be analyzed" }`
- Response: `{ "result": "[totalscore: <score>]" }`
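Because `/llm` returns the score embedded in a string, a client has to parse it out. The sketch below assumes `<score>` is a plain integer or decimal number, which the format above does not confirm; `parse_total_score` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumption: the score is numeric, e.g. "[totalscore: 85]" or "[totalscore: 7.5]".
SCORE_PATTERN = re.compile(r"\[\s*totalscore\s*:\s*(-?\d+(?:\.\d+)?)\s*\]", re.IGNORECASE)


def parse_total_score(result: str) -> Optional[float]:
    """Return the numeric score from a "[totalscore: <score>]" string, or None if absent."""
    match = SCORE_PATTERN.search(result)
    return float(match.group(1)) if match else None
```

Returning `None` instead of raising lets the caller decide how to handle LLM output that does not follow the expected format.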
Using `curl` to send an SMS text for phishing analysis:

    curl -X POST "http://localhost:8000/check-sms" -H "Content-Type: application/json" -d '{"sms_text": "Check out this link http://example.com"}'

The API will analyze the URL(s) and return a result indicating whether the URL is safe or a potential phishing site.
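The same request can be sent from Python using only the standard library. This is a sketch that assumes the server is running locally on port 8000; `build_check_sms_request` and `check_sms` are hypothetical helper names.

```python
import json
import urllib.request


def build_check_sms_request(sms_text: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST request for /check-sms without sending it."""
    body = json.dumps({"sms_text": sms_text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/check-sms",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_sms(sms_text: str, base_url: str = "http://localhost:8000", timeout: float = 30.0) -> dict:
    """Send the SMS text to the API and return the decoded JSON response."""
    request = build_check_sms_request(sms_text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the API to be running locally.
    print(check_sms("Check out this link http://example.com"))
```

The generous timeout is a guess: the analysis involves web scraping and an LLM call, so responses may take noticeably longer than a typical API request.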
- `main.py`: The main application file where all endpoints and logic are defined.
- `requirements.txt`: Dependencies required to run the application.
- `.env`: Environment file to store API keys (not included in the repository for security reasons).
- FastAPI: A modern, fast web framework for building APIs with Python.
- Pydantic: Data validation and settings management using Python type annotations.
- Selenium: For automating web browser interaction and scraping web content.
- BeautifulSoup: For parsing HTML and extracting specific elements from web pages.
- Google Gemini AI: Used to perform machine learning analysis on web page content.
- CriminalIP API: An external API to check if a URL is associated with malicious activity.
- Fork the repository.
- Create a new feature branch (`git checkout -b feature/new-feature`).
- Commit your changes (`git commit -m 'Add new feature'`).
- Push to the branch (`git push origin feature/new-feature`).
- Create a Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.
Special thanks to the developers and maintainers of Google Gemini, FastAPI, Selenium, and the other libraries that made this project possible.