Updated to Phi-3.5 #93
base: main
Conversation
Let's think critically: censorship would only be a pressure point if the AI were being asked to create malicious content. Humanify, however, usually queries the model simply to assess/summarize the code internally, and the model then returns variable or function names that closely resemble their function/usage in the code. A censored model does not inherently discard information; it just will not respond with the censored topics/phrases. It is unlikely that the model will reject proper names or throw an error unless the back-end censor systems are flawed in the first place. I do think there is merit to supporting a diverse set of models, including uncensored ones. But the statement "uncensored is better" is a fallacy: it is subjective, and it is both unverifiable and unfalsifiable. On the other hand, a model that is fine-tuned on, or trained on a data set relevant to, the code's domain will be better, as it can help reduce hallucinations.
My suggestion stemmed from the observation that when asking ChatGPT, Gemini, or similar tools to reverse engineer something, they often respond with restrictions. While I know some techniques to bypass this (jailbreaking), I proposed using an uncensored option to conserve pre-instruction tokens. I respectfully disagree with the initial lack of support for multiple models. Currently, there isn't an optimal free API or local version that works for everyone. Sure, sticking to a single reliable model would minimize bug reports and issues, and it would facilitate a shared database that enhances deobfuscation by avoiding inconsistencies in variable names across models. But allowing experimentation until the project reaches a certain level of maturity, without pushing everyone to create their own fork, would benefit everyone involved.
Asking for unethical actions would encounter this type of restriction. However, Humanify only asks the AI to analyze the code and return new names that fit the usage of each variable or function. Reverse engineering has clearly gotten a bad reputation, and while I believe it should be allowed, it is not. Instead of brazenly asking the model to reverse the code (which would fare worse anyway, since it is a complex request), it is better to break the task down: ask the AI to clarify how the code works, to refactor it, or to make it easier to read. This taps into the model's coding-assistant behaviors instead of whatever the censor classifies reverse engineering as. Do note: Humanify does NOT do this, as it only asks for help renaming things.
I never said to stick to one model. What I alluded to is that whoever wants to use an AI model for a specific purpose would want to fine-tune it to produce fewer hallucinations. As it stands, Humanify does not need a fine-tuned model, and support for other models exists to stay agnostic of/independent from any single source. It also seems you have the wrong notion of what the AI model is used for. The AI model is NOT used for de-obfuscation; it is used to help create human-readable names for variables and functions for un-minification purposes. In no way is the AI touching the code; read the section titled "Don't let AI touch the code" in the project's blog post: https://thejunkland.com/blog/using-llms-to-reverse-javascript-minification
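To make that division of labor concrete, here is a minimal sketch (not Humanify's actual implementation; the local Ollama endpoint, model name, prompt wording, and helper names are all assumptions): the model is asked only for a name mapping, and a deterministic AST transform applies it.

```ts
import { parse } from "@babel/parser";
import traverse from "@babel/traverse";
import generate from "@babel/generator";

// Hypothetical LLM call: the model only ever sees the code and returns names.
// Endpoint and model name are assumptions (a local Ollama instance here).
async function suggestNames(code: string): Promise<Record<string, string>> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "phi3.5",
      stream: false,
      prompt:
        "Given this minified JavaScript, return only a JSON object mapping " +
        "each short identifier to a descriptive name:\n\n" + code,
    }),
  });
  const { response } = await res.json();
  return JSON.parse(response); // e.g. { "a": "userCount", "b": "fetchUsers" }
}

// Deterministic rename: the LLM output is treated as data; a scope-aware
// AST transform rewrites the code, so the model never touches it directly.
function applyRenames(code: string, renames: Record<string, string>): string {
  const ast = parse(code);
  traverse(ast, {
    Program(path) {
      for (const [oldName, newName] of Object.entries(renames)) {
        if (path.scope.hasBinding(oldName)) path.scope.rename(oldName, newName);
      }
    },
  });
  return generate(ast).code;
}
```

Because the renames are applied mechanically by the AST transform, a hallucinated suggestion can at worst produce a misleading label; it can never break the code itself.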
This statement confuses me, as Humanify does support multiple models, including a free local one.
I have a question if you don't mind.
Do you think that using uncensored models would be better for reverse-engineering purposes?