Future Updates and Known Issues
In ShellSweepX, we've opted not to implement a feedback loop, unlike its predecessor ShellSweepML. This decision was based on the current model's robust performance in detecting webshells.
- Model Efficacy: The existing model demonstrates high accuracy in identifying webshells.
- Acceptable False Positive Rate: We've determined that the current low rate of false positives is within acceptable limits for operational use.
During our development process, we conducted extensive testing on feature selection:
- Initial Testing: Performed in ShellSweepML
- Feature Range: Experimented with varying numbers of features:
  - Tested with 10,000 features
  - Tested with 1,000 features
  - Settled on 5,000 features as the optimal balance
After thorough evaluation, we concluded that 5,000 features provide the best compromise between model accuracy and computational efficiency. This configuration offers robust webshell detection capabilities without the immediate need for a feedback loop mechanism.
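To make the feature-count trade-off concrete, here is a minimal sketch of capping a vocabulary at a fixed number of features. It assumes the features are simple token counts, which this page does not confirm; `build_vocabulary`, `vectorize`, and the token pattern are hypothetical names chosen for illustration and are not taken from ShellSweepX or ShellSweepML.

```python
import re
from collections import Counter

# Hypothetical tokenizer: identifiers and PHP-style variables such as $_POST.
TOKEN_PATTERN = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def build_vocabulary(documents, max_features):
    """Keep only the max_features most frequent tokens across all documents."""
    counts = Counter()
    for text in documents:
        counts.update(TOKEN_PATTERN.findall(text))
    # most_common() sorts by frequency; ties keep first-seen order.
    ranked = counts.most_common(max_features)
    return {token: index for index, (token, _count) in enumerate(ranked)}


def vectorize(text, vocabulary):
    """Turn one document into a count vector with one slot per vocabulary token."""
    vector = [0] * len(vocabulary)
    for token in TOKEN_PATTERN.findall(text):
        index = vocabulary.get(token)
        if index is not None:  # tokens outside the capped vocabulary are dropped
            vector[index] += 1
    return vector
```

A larger `max_features` keeps more rare tokens, at the cost of longer vectors to store and score. A smaller value is cheaper but discards signal. That is the balance described above, with 5,000 as the chosen value.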
While not currently implemented, we remain open to introducing a feedback loop in future versions if user experiences and evolving threat landscapes indicate a need for continuous model refinement.
We initially tested content searching using the syntax content:"cmd". However, this feature was removed from the initial release due to performance issues:
- Increased search delay, significantly impacting user experience
- Operational challenges in implementing it correctly
We aim to reintroduce this feature in a future update, focusing on:
- Optimizing search performance
- Improving the accuracy of content matching
- Balancing functionality with user experience
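As an illustration of the content:"cmd" syntax, the sketch below parses such a query and runs a naive substring scan over file contents. It is not the removed ShellSweepX implementation: the grammar (no escaped quotes), the function names, and the in-memory `files` mapping are all assumptions made for the example.

```python
import re

# Hypothetical grammar: content:"<text>", with no escaped quotes inside the text.
CONTENT_QUERY = re.compile(r'^content:"([^"]+)"$')


def parse_content_query(query):
    """Return the quoted search term, or None if the query is not a content search."""
    match = CONTENT_QUERY.match(query.strip())
    return match.group(1) if match else None


def search_contents(files, query):
    """Return the sorted names of files whose text contains the search term.

    files is a mapping of file name to file text (an assumption for this sketch).
    """
    term = parse_content_query(query)
    if term is None:
        raise ValueError(f"not a content query: {query!r}")
    # Naive linear scan: every file body is read in full for every query.
    return sorted(name for name, text in files.items() if term in text)
```

A scan like this touches every stored file on every query, so its cost grows with the size of the database. That is one plausible source of the delay a reworked version would need to address, for example with an index.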
SSL/TLS is not currently implemented in ShellSweepX.
- Internal Hosting: The primary use case for ShellSweepX is internal network deployment, which may reduce the immediate need for SSL/TLS.
- Security: While internal hosting provides some security, implementing SSL/TLS would enhance data protection and privacy.
For future releases, we plan to:
- Evaluate the necessity of SSL/TLS based on user feedback and deployment scenarios
- Potentially include SSL/TLS support in future releases to enhance security for various deployment options
If deploying ShellSweepX in an environment where additional security layers are required, consider implementing network-level security measures in the interim.
When starting the server, you may encounter warnings about unpickling estimators from an older version of scikit-learn. This is due to a version mismatch between the saved models (v1.3.0) and the currently installed scikit-learn (v1.5.1).
1. Recommended: Update pickled models
   - Retrain and save models using scikit-learn 1.5.1
   - Update relevant code to use the new models
2. Alternative: Downgrade scikit-learn
   pip install scikit-learn==1.3.0
3. Temporary solution (not recommended for production): Add this code before loading models:
   import warnings
   warnings.filterwarnings("ignore", category=UserWarning)
Note: Option 1 is the best long-term solution. Option 3 should only be used for temporary testing purposes. However, we have not noticed any issues so far, even when handling more than 7,000 files in the database.
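If the temporary workaround is used, it may be safer to limit the suppression to the model-loading call instead of silencing every UserWarning for the whole process. The sketch below assumes the models are plain pickle files, as the term "pickled models" suggests; `load_model_quietly` and its `path` argument are hypothetical and not part of ShellSweepX.

```python
import pickle
import warnings


def load_model_quietly(path):
    """Unpickle a model, ignoring UserWarning only while the file is loaded."""
    with warnings.catch_warnings():
        # The filter is active only inside this block; the previous
        # warning configuration is restored when the block exits.
        warnings.simplefilter("ignore", category=UserWarning)
        with open(path, "rb") as handle:
            return pickle.load(handle)
```

Unrelated UserWarnings raised elsewhere in the server remain visible, which keeps the workaround from hiding other problems.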
Sending a large number of files simultaneously, especially from an entire server or a directory with numerous files, can potentially overload the server and cause unresponsiveness.
To mitigate this issue, we've implemented a batching system:
$batchSize = 50
$waitTime = 20
This approach:
- Sends files in batches of 50
- Waits 20 seconds between each batch
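The same batching logic can be sketched in Python for readers who are not using the PowerShell client. This is an illustrative rendering, not the project's code: `send_in_batches` and the `send_batch` callback are hypothetical, and only the defaults of 50 files and 20 seconds come from the settings above.

```python
import time


def send_in_batches(files, send_batch, batch_size=50, wait_seconds=20, sleep=time.sleep):
    """Send files in fixed-size batches, pausing between batches.

    send_batch is a caller-supplied function that uploads one list of files.
    Returns the number of batches sent.
    """
    batches = 0
    for start in range(0, len(files), batch_size):
        if batches > 0:
            sleep(wait_seconds)  # pause between batches, not before the first one
        send_batch(files[start:start + batch_size])
        batches += 1
    return batches
```

With the defaults, 120 files go out as batches of 50, 50, and 20 with two 20-second pauses. The `sleep` parameter exists so the pause can be replaced in tests.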
To optimize performance and prevent overload:
- Target Specific Directories:
  - Focus your scans on directories likely to contain web shells
  - Avoid sweeping entire servers if possible
- Utilize File Extensions:
  - Tailor your scans to relevant file types
  - Remember: PHP files are less likely on Windows servers, while ASPX files are uncommon on Linux
- Consider Server Environment:
  - Adjust your sweep parameters based on the server's OS and typical web technologies
- Monitor Performance:
  - Keep an eye on server responsiveness during scans
  - Adjust batch size or wait time if needed
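The first two recommendations can be sketched as a small file-selection step that runs before anything is sent. The function name and the example arguments are hypothetical; which directories and extensions are appropriate depends on the server being swept.

```python
from pathlib import Path


def collect_targets(directories, extensions):
    """Return files under the given directories whose extension is in extensions.

    Extensions are matched case-insensitively and must include the dot,
    for example ".php" or ".aspx".
    """
    wanted = {ext.lower() for ext in extensions}
    targets = []
    for directory in directories:
        # rglob("*") walks the directory tree recursively.
        for path in Path(directory).rglob("*"):
            if path.is_file() and path.suffix.lower() in wanted:
                targets.append(path)
    return sorted(targets)
```

For example, `collect_targets(["/var/www"], [".php"])` would limit a sweep on a typical Linux web server to PHP files under the web root instead of the whole filesystem.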
We are continuously working to enhance the efficiency of the scanning process. Future updates may include:
- Dynamic batch sizing based on server performance
- More granular control over scan parameters
- Improved server-side processing to handle larger volumes of data
Remember: Efficient and targeted sweeping not only prevents server overload but also improves the accuracy and speed of your web shell detection efforts.