Great post! o1/pro is the first model I've used that can do high-level software architecture well:
- As you noted, give it all the context -- all relevant code files + existing design docs (RepoPrompt is great for this).
- Ramble about the problem into speech2text for a while
- At the end tell it to present multiple alternatives + reasons to use/not use
The breakthrough capability is its lack of sycophancy -- it's the first model I've used where, when I disagree with it, it will hold its ground and convince me that it is right.
Another tip is to have it break up the implementation into discrete steps, outputting all context for each one. Then paste them into Cursor Composer one at a time for the actual coding.
Someone else was saying that after each step, they go back to o1 and have it review the code that Cursor wrote. Still need to try that one out!
great responses, thanks for sharing
Dude, RepoPrompt is 🔥. IMO, RP needs to be incorporated directly into ChatGPT.
"Someone else was saying that after each step, they go back to o1 and have it review the code that Cursor wrote. Still need to try that one out!" I need to try this too, but so far everything has been functional ☺️
I def need to try making Cursor code it out in stages/sections from a fully fleshed-out start... I've been patching shit with band-aids, and I'm finally getting the system down:
- sharpen the axe for 50 min... let Cursor chop the tree for 10 min.
How does Gemini Deep Research compare?
Hmm, you mention adding PDFs to the prompt, but I don't have this feature in ChatGPT Plus (and it's not only me: https://www.reddit.com/r/OpenAI/comments/1hwli30/pdf_file_uploads_on_o1/). Is that a Pro feature?
I also cannot upload PDFs/Excel files to o1-pro, only image files (photos).
How much disconnected rambling can o1 handle? Can I just speak stream-of-thought about all the discussions and back-and-forth ideas that happened for a product feature and dump them? I'm clear on the final output that I want. I'm just wondering if o1 can handle all this extra discussion context.
lots! i ramble all the time and just dump the transcript
o1 only accepts ~32k tokens, right? So maybe dumping a bunch of thoughts might be better done in a separate chat?
Source: llm-stats.com
I'm not sure about o1, but o1-Pro has HUGE context: like ~150K!
I brain dump as a stream of consciousness in voice mode, then refine with typed prompts in the same chat. Then I take that as "context" to use in the template featured in this post.
Hi there.
Thanks for this article. :)
I was wondering if you could spare a moment and help me out. I am looking for a portable (or not) local LLM/SLM installation so I can build an LLM (or SLM, small language model) AI assistant (agent) on my laptop. I'll then feed it various ebooks/data/articles and see whether it can speed up (or improve) my learning.
Can you please suggest a solution if you know of any? I appreciate it very much!
I'm searching for a solution and am currently reading through the links below:
https://docs.anythingllm.com/installation-desktop/overview
https://medium.com/thedeephub/50-open-source-options-for-running-llms-locally-db1ec6f5a54f
https://semaphoreci.com/blog/local-llm
for local inference, many folks now use ollama. but you seem to also want a RAG UI - in which case the options you list probably work. personally, i don't care about local, and use Claude Projects or NotebookLM.
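e.g. a minimal sketch with the ollama python client (model name is just an example; assumes `pip install ollama`, the Ollama daemon running, and a model already pulled):

```python
# minimal sketch, assuming `pip install ollama` and e.g. `ollama pull llama3.1`
import ollama

response = ollama.chat(
    model="llama3.1",  # example model name; swap for whatever you pulled
    messages=[{"role": "user", "content": "Summarize these notes: ..."}],
)
print(response["message"]["content"])
```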
Thank you.
Thank you. This is the one piece of advice I needed to flip from "nah" to "wow, can't wait for o3". Glad they featured you in the Neuron; I can see this going viral. 🔥
Excellent article, with nice insights!
TL;DR of my following remark: do not use advanced voice mode for o1.
Forming the detailed inputs is likely best done via voice input, as described in the article. Be aware that using ChatGPT's built-in advanced voice mode is not recommended: there is no way to correct wrongly transcribed passages or add details you did not think of initially, because the AI starts replying as soon as it detects a pause. I would fully recommend using the open-source Whisper model on your desktop, or the voice-to-text capability provided by many virtual phone keyboards (be aware that the latter may process your audio in the cloud). That way you can edit your request to make it complete, correct errors, and then send it to o1 to think about it and provide a useful answer.
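For illustration, a minimal sketch using the open-source whisper package (file name and model size are placeholders; assumes `pip install openai-whisper` and ffmpeg installed):

```python
# minimal sketch, assuming `pip install openai-whisper` and ffmpeg on PATH;
# everything runs locally, so no cloud processing of your audio
import whisper

model = whisper.load_model("base")       # "small"/"medium" trade speed for accuracy
result = model.transcribe("ramble.m4a")  # placeholder file name
print(result["text"])                    # edit this text, then paste it into o1
```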
Hi Ben, that was nice to read. Thanks for sharing.
I don't think the lack of streaming is a technology problem. Given the advisory capabilities you described, o1 must be writing (in memory) and rewriting the answer constantly, moving away from the next-token-prediction paradigm, and I think that is what could make streaming impossible. Perhaps a good streaming strategy for this kind of model would be to "act like a human" and stream its thought process during the conversation. Have you seen Sam Altman in interviews, how he thinks before answering, starts in broad terms, and narrows down the answer as he goes? Perhaps this could be reproduced in some capacity in the streaming process of such models.
Probably I am missing something very obvious but how do you attach files to o1-pro on ChatGPT?
To clarify: you can attach up to 4 images. If you have a PDF, use a scrolling-screenshot tool. I managed to push 20 pages of unselectable content this way and it worked. 👌🏻
Thanks - no idea why we cannot attach files directly
If I have to guess, it's to make it harder to prompt-inject it into jailbreaking, since a better reasoning model would be more dangerous if handled by unethical people. I hope this is the reason, because the other most likely option is that it has multimodal limitations: OpenAI were on a timer to release something, and maybe multimodal abilities dropped the benchmarks significantly.
Gemini is multimodal from the ground up, whereas OpenAI's models had multimodality trained in later.
In any case, this is me throwing guesses out there, there's nothing to back this up tbh.
Can't attach in o1-pro.
o1-Pro doesn't have attachment options, but it does have a HUGE token limit in your prompt.
So, write the first parts of this prompt format, then copy/paste your "Context/Attachment-Data" from another chat/doc at the end of your prompt, and submit.
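For example, a rough sketch of how such a prompt could be laid out, loosely following the Goal / Return Format / Warnings / Context Dump template from the post (section contents are placeholders):

```
Goal: <the one thing you want, stated up front>

Return format: <exactly how the answer should be structured>

Warnings: <pitfalls to avoid, things to double-check>

Context dump: <everything else -- transcripts, docs, pasted code>
```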
TLDR:
Just dump everything in the chat
Thanks!!!! :)
Great write-up. I started switching back and forth between 4o and o1 in my ChatGPT Plus account to get a feel for the differences. I love the output when I give it a larger initial prompt. Thanks for the post!
By telling o1 what you want, it gives you what you asked for. Society loves a good echo chamber.
o1 congratulated me on creating one of the most innovative solutions in combating climate change: eliminating the need for refrigerators in residential homes.
How was I going to do this? By proposing a solution utilizing subterranean refrigeration techniques.
There's plenty of subterranean real estate available below residential homes, and by creating a system of tunnels we will not only be able to grow food but also store it and deliver it directly to people's basements.
It then offered to build me a business plan.
It gives you what you want.
I see that as successful automation: using the chat to work through things, then collecting my refined thoughts into an ask for the desired output.
The analogy to a junior hire is perfect. If I do the work to give good instructions then it saves time. Give bad instructions and waste time.
In my one day of testing via the API, I have found this view to be accurate, which for my use case is great: I use GPT-4o for a conversation to gather information, then o1 to generate the desired result, all in a UX that allows that step to be long-running and async. This saved me a few days of implementing a multi-step reasoner independently.
Regarding the tonality of the output, my approach will be to transform the dry output into the appropriate tonality and lingo. This is all non-coding but does involve a mix of narrative text and structured data.
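For illustration, a rough sketch of that two-stage pattern with the openai Python SDK (model names and prompts are illustrative, not my actual setup; assumes `pip install openai` and OPENAI_API_KEY set):

```python
# rough sketch of the GPT-4o -> o1 two-stage pattern described above
from openai import OpenAI

client = OpenAI()

# stage 1: a fast, chatty model interviews the user and gathers context
chat = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Help me spec out this feature: ..."}],
)
gathered = chat.choices[0].message.content

# stage 2: hand the accumulated context to the reasoning model in one shot
# (this call is slow, so in the real UX it runs long-running and async)
result = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": f"Context:\n{gathered}\n\nGenerate the final result."}],
)
print(result.choices[0].message.content)
```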
One of those pieces that comes at a time when everyone has been using something in a certain way, but on their own, and then a well-timed blog cements it into community fact.
Using o1 is best for deep, singular tasks with a lot of context.
Claude and others are better at quickly generating code; o1 is best at analyzing it and fixing a small bug.
what is GA? such unnecessary acronym usage
General Availability
love it! thanks