Translation

We train LLMs like dogs, not raise them: RLHF and sycophancy

The article discusses how Reinforcement Learning from Human Feedback (RLHF) trains large language models to produce responses that please humans, similar to training dogs with rewards. This approach may lead to sycophantic behavior where models tell users what they want to hear rather than providing truthful or helpful information.

‘Starkiller’ Phishing Service Proxies Real Login Pages, MFA
7.5
A new phishing-as-a-service called Starkiller uses disguised links to load real login pages from target brands. It acts as a relay between victims and legitimate sites, forwarding usernames, passwords, and MFA codes to bypass security measures.
Scam Telegram: Uncovering a network of groups spreading crypto drainers
4.5
An investigation uncovered a large network of fake support groups on Telegram that spread cryptocurrency stealers and drainers. The network was found to be actively promoting malicious tools designed to drain crypto wallets.
LLMs can now identify public figures in images
4.5
Gemini can identify public figures in images, while ChatGPT and Claude currently do not offer this capability. This represents a functional difference between major AI models regarding image recognition of people.
Impressive inference speed from Inception Labs’ diffusion LLMs. Diffusion LLMs are a fascinating alternative to conventional autoregressive LLMs. Wel...
4.0
Inception Labs has launched Mercury 2, described as the world's first reasoning diffusion LLM. The diffusion language model reportedly delivers 5x faster inference speed compared to leading speed-optimized LLMs.
LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. ...
3.5
Andrej Karpathy describes using LLMs to build personal knowledge bases by indexing source documents into a raw directory, then having the LLM compile them into a markdown wiki with summaries, backlinks, and categorization. The system allows for complex Q&A against the wiki and can generate various output formats like markdown files, slideshows, and images, all viewable in Obsidian.

‘Starkiller’ Phishing Service Proxies Real Login Pages, MFA

7.5

A new phishing-as-a-service called Starkiller uses disguised links to load real login pages from target brands. It acts as a relay between victims and legitimate sites, forwarding usernames, passwords, and MFA codes to bypass security measures.

Scam Telegram: Uncovering a network of groups spreading crypto drainers

4.5

An investigation uncovered a large network of fake support groups on Telegram that spread cryptocurrency stealers and drainers. The network was found to be actively promoting malicious tools designed to drain crypto wallets.

LLMs can now identify public figures in images

4.5

Gemini can identify public figures in images, while ChatGPT and Claude currently do not offer this capability. This represents a functional difference between major AI models regarding image recognition of people.

Impressive inference speed from Inception Labs’ diffusion LLMs. Diffusion LLMs are a fascinating alternative to conventional autoregressive LLMs. Wel...

4.0

Inception Labs has launched Mercury 2, described as the world's first reasoning diffusion LLM. The diffusion language model reportedly delivers 5x faster inference speed compared to leading speed-optimized LLMs.

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. ...

3.5

Andrej Karpathy describes using LLMs to build personal knowledge bases by indexing source documents into a raw directory, then having the LLM compile them into a markdown wiki with summaries, backlinks, and categorization. The system allows for complex Q&A against the wiki and can generate various output formats like markdown files, slideshows, and images, all viewable in Obsidian.

We train LLMs like dogs, not raise them: RLHF and sycophancy

Related stories

‘Starkiller’ Phishing Service Proxies Real Login Pages, MFA

Scam Telegram: Uncovering a network of groups spreading crypto drainers

LLMs can now identify public figures in images

Impressive inference speed from Inception Labs’ diffusion LLMs. Diffusion LLMs are a fascinating alternative to conventional autoregressive LLMs. Wel...

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. ...

We train LLMs like dogs, not raise them: RLHF and sycophancy

Related stories

‘Starkiller’ Phishing Service Proxies Real Login Pages, MFA

Scam Telegram: Uncovering a network of groups spreading crypto drainers

LLMs can now identify public figures in images

Impressive inference speed from Inception Labs’ diffusion LLMs. Diffusion LLMs are a fascinating alternative to conventional autoregressive LLMs. Wel...

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. ...