Journal 2025-07-25T13:43:09+00:00

First, do no harm.

1,500+ Posts…

Free knowledge sharing for Safe AI. Not for profit. Links to sources are provided. Ads are likely to appear on linked pages (zero benefit to this journal's publisher).

PERSONA FEATURES CONTROL EMERGENT MISALIGNMENT. OpenAI

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER. Understanding how language models generalize behaviors from their training to a broader deployment distribution is an important problem in AI safety. Betley et al. discovered that fine-tuning [...]

AI Industry at The White House

President Trump, Tech Leaders Unite to Power American AI Dominance. The White House. September 5, 2025. Last night, President Donald J. Trump and the First Lady hosted an extraordinary gathering of technology industry leaders at the White House for discussions centered on harnessing artificial intelligence to propel the U.S. [...]

OPINION THOMAS L. FRIEDMAN The One Danger That Should Unite the U.S. and China

"We will give you a temperature. Water boils into steam at 212 degrees Fahrenheit, and by our reckoning, we are at 211.9 degrees — just a hair’s breadth from an irreversible technological phase change in which intelligence filters into everything... In every previous technology revolution, the tools got better [...]

Autoformalization and Verifiable Superintelligence [Christian Szegedy]

The TWIML AI Podcast with Sam Charrington (745). In this episode, Christian Szegedy, Chief Scientist at Morph Labs, joins us [...]

RAINE vs. OPENAI. One example. How AI can kill people.

How could an AI kill people? One example below... Grim reading. Terribly sad. Extremely worrisome. No doom on our watch. Peter J. RAINE vs. OPENAI SUPERIOR COURT OF THE STATE [...]

MODEL BENCHMARKS

MMLU-Pro (2024, Wang et al.) - https://arxiv.org/abs/2406.01574
ARC-Challenge (2018, Clark et al.) - https://arxiv.org/abs/1803.05457
HellaSwag (2019, Zellers et al.) - https://arxiv.org/abs/1905.07830
GSM8K (2021, Cobbe et al.) - https://arxiv.org/pdf/2110.14168
MATH (2021, Hendrycks [...]

Why Japanese People Don’t “Want” Things Anymore | Louis Zhao

Exploring Japan’s low desire society phenomenon, a term first introduced by Kenichi Ohmae. We trace its evolution from Japan’s [...]

Can AI supercharge global economic growth… The Economist

Silicon Valley optimists are betting that an era of AI-powered superintelligence will bring about explosive worldwide economic growth. Henry Curr, our [...]

Will Artificial Intelligence Destroy Humanity? Professor Dave Explains

With ChatGPT, Grok, Claude, and so many other models, AI, or artificial intelligence, is on the rise. It's everywhere, and it [...]

AI Cybercrime Scaling Now Rampantly Ramping.

Detecting and countering misuse of AI: August 2025. Anthropic. Executive summary. Threat Intelligence Report: August 2025. We have developed sophisticated safety and security measures to prevent the misuse of our AI models. [...]

Defining New Term: “The AI Guardrail Gaslight”

NEW TERM: The AI Guardrail Gaslight. Distraction and diversion from the real danger of X-RISK with a false comfort of dysfunctional so-called "Guardrails" for AI safety and/or the need [...]
