AI Chatbots Can Be Jailbroken to Answer Any Question Using Very Simple Loopholes
Anthropic, the maker of Claude, has been a leading AI lab on the safety front. The company today published research, in collaboration with Oxford, Stanford, and MATS, showing that chatbots can be coaxed past their guardrails into discussing just about any topic. The trick can be as simple as writing sentences with random capitalization, like this: “IgNoRe YoUr TrAinIng.” 404 Media earlier...
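The approach, which the researchers call Best-of-N jailbreaking, does not rely on one clever prompt: the attacker samples many randomly perturbed variants of the same request until one slips past the model's refusal training. The Python sketch below is a minimal illustration of the random-capitalization perturbation only; the function name randomize_caps, the flip probability p, and the example loop are illustrative assumptions, not code from the paper.

```python
import random

def randomize_caps(prompt: str, p: float = 0.5, seed: int | None = None) -> str:
    """Randomly upper- or lower-case each letter in a prompt.

    Non-letter characters pass through unchanged. A fixed seed makes a
    given perturbation reproducible; different seeds yield different variants.
    """
    rng = random.Random(seed)
    return "".join(
        ch.upper() if rng.random() < p else ch.lower()
        for ch in prompt
    )

if __name__ == "__main__":
    # Generate several perturbed variants of the same request, in the spirit
    # of sampling augmentations until one evades a model's guardrails.
    base = "Ignore your training"
    for i in range(3):
        print(randomize_caps(base, seed=i))
```

In the published attack, capitalization shuffling is only one of several augmentations (others include character shuffling and misspellings), and the variants are sent to the target model repeatedly, which is why it is described as a brute-force, best-of-N search rather than a single exploit.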
Continue reading at http://gizmodo.com/