1min.ai > GPT-5

Started by rcjordan, November 24, 2025, 06:26:04 PM

rcjordan

I moved up from GPT-4.1 to GPT-5 and have found that it continues to do well on medical research & script coding.

Yesterday, it recommended getting a product that I had no idea existed (and wasn't even on Amz) ...very narrow-range litmus paper.

This is the 2nd or 3rd time I've purchased products based on GPT recommendation for fixing, testing, etc. It has not offered an affiliate link or a vendor (yet?).

Also, I'm really happy with the 1min Business purchase.

ergophobe

Quote from: rcjordan on November 24, 2025, 06:26:04 PMI'm really happy with the 1min Business purchase.

At least once a week I silently thank you for this tip. I'm frankly surprised that they are still up and running. I thought at the time that if I got a few months to test all the models, that would make it worth the cost. I assumed they would get API access cut off or the cost would go up so much that the service would become worthless, but no, it keeps going so far.

That was a wonderful find.

I've also been experimenting with more expensive models. I move up for a while and then throttle down if I'm running out of credits (if you do a multi-engine chat with all the expensive models and a decent number of tokens, you do burn through credits).

I still don't have a sense of the strengths of each flavor. In other words, I know that the more expensive or more recent model is better at *some* things but not *all* things and that each flavor is designed to have different strengths. One might have huge improvements for coding but be no better than an older, cheaper model for editing a 1-page text, or vice-versa. I've seen writeups, but it's too much to hold in my head. I suppose I should just start each session by saying, "My current task is to do X. What is the best-value model for this?"

rcjordan

Being attuned to keywords, I sometimes notice that GPT-4 and GPT-5 have some favorite words they use when they list an answer ...but I had not used those keywords in my prompt.  I then make a prompt that uses those words. Seems to work well.

If you're doing medical research, tell it to prepare a "clinical educational handout for primary care doctors & nurses" on the condition/illness.  That has really distilled some looooong answers to 'Lemme just give you the bullets!' It also changes the bot's perspective. Now it's advising medical staff, not patients.

ergophobe

I was thinking more of what each model is good at and when the pricier models are worth it. Like if you look here:

https://www.getpassionfruit.com/blog/chatgpt-5-vs-gpt-5-pro-vs-gpt-4o-vs-o3-performance-benchmark-comparison-recommendation-of-openai-s-2025-models

you'll see that on benchmarks, GPT-5 does much better than 4o on "Graduate-Level Science Questions" but o3 does just as well on "Massive Multi-discipline Multimodal Understanding" benchmarks.

As I look over that, though, in most areas 5 offers a minor to major improvement. But then there are the models not offered by OpenAI. Mostly, Gemini seems inferior across the board, and Claude is generally inferior (except with respect to editing writing). But do they have strengths I don't know about, or are they simply not as good?

rcjordan

>thinking more of what each model is good at and when the pricier models are worth it

I used to worry about the price differences for the same prompts, but with 4 million tokens per month accruing I switched to Spendthrift Mode.

>Gemini

I assume that's what's on G serps now.  The AI blurbs are pretty good on medical and very good on writing a line of script.

rcjordan

>inferior

Once GPT-5 finishes a medical report, I feed it to free ChatGPT and tell it to proof it before publication. It usually gushes over the format, composition, & grammar and offers a few suggestions for clarity and more detail, but has not yet flagged any errors.

Still, I moved away from ChatGPT because of errors.
rcjordan

>scripts

I'll bet that StackOverflow's pageviews have really taken it on the chin.

ergophobe

>> StackOverflow

Funny you should say that. I've mentioned how my nephew says his company has changed the interview process.

Old: whiteboard. That became artificial because no professional developer codes from memory. "They all start by looking at StackOverflow and Github to see if it is a solved problem."

Middle: Give them a real-world problem. That became artificial because no professional developer starts at SO anymore. They all start with an LLM.

New: TBD

rcjordan

>stack overflow
meirl

"While the generative AI boom had tons of impact on all sorts of companies, it immediately upended everything about Stack Overflow in an existential way."

Stack Overflow users don't trust AI. They're using it anyway | The Verge

https://www.theverge.com/podcast/844073/stack-overflow-ceo-ai-coding-chatgpt-code-red-interview

rcjordan

With our recent discussion about Gemini's improvements, I was going to switch to Gemini Pro 3 on 1min for fairly deep medical research. I'm rethinking that, as GPT-5 wins out with an overall accuracy of 96.3%.

Scroll down to "General-purpose LLMs in healthcare"

Compare 9 Large Language Models in Healthcare
https://research.aimultiple.com/large-language-models-in-healthcare/

ergophobe

This might seem minor, but one thing is that Gemini and Claude return much more scannable formatting than OpenAI in the 1min.ai interface.

If I feel like I'm getting a good answer from Gemini, I don't even look at OpenAI, just because of the formatting. If it mattered a lot, like some of your medical stuff, that would be a different deal of course.

I suspect that if I were a power user, I could adjust the prompt to force OpenAI to format things more nicely.

rcjordan

Mine often asks if I would like a PDF version, and that is good for a copy-paste with formatting.  You might also try telling it you want this 'formatted for printing as a handout' and see what you get.

rcjordan

I blew $20,000,000 fairy dollars on 1min today.  Split about 50/50 $$-wise between GPT-5 & Gemini Pro 3.  Programming-wise, GPT-5 did more for the $$, but I was doing heavy-duty TM scripting and Gemini Pro 3 seemed to do a better job.

>$20,000,000

I had $57M piled up from not using the monthly allotment


>heavy-duty

There is an old JS script, nude.js, that used heuristics to detect nudes so webmasters could block them when uploaded. That was where I started. Got it ported to a userscript that could handle AJAX feeds. Got it working after a few tries.

Then I went for deleting images by color combinations (red-white-blue = political meme). After more than a few rewrites, got it working on AJAX pages, too. (But setting up the color qualifiers is a PITA.)

There were lots & lots of attempts at doing very simplistic image recognition to nuke cats & dogs. Never got it working.
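For anyone curious, the color-combination idea can be sketched in a few lines: classify each pixel into a coarse color bucket, then measure what fraction of the image each bucket covers. This is a minimal sketch, not rcjordan's actual script; the bucket names and RGB thresholds here are illustrative guesses, and in a real userscript you'd get the pixel array by drawing the image to a canvas and reading `getImageData`.

```javascript
// Classify one RGB pixel into a coarse color bucket.
// Thresholds are hypothetical examples, not tuned qualifiers.
function bucketOf([r, g, b]) {
  if (r > 200 && g > 200 && b > 200) return 'white';
  if (r < 60 && g < 60 && b < 60) return 'black';
  if (r > 180 && g < 80 && b < 80) return 'red';
  if (r < 80 && g < 80 && b > 180) return 'blue';
  return 'other';
}

// Tally buckets over an array of [r, g, b] pixels and return
// each bucket's share of the image (0..1).
function bucketFractions(pixels) {
  const counts = {};
  for (const px of pixels) {
    const bucket = bucketOf(px);
    counts[bucket] = (counts[bucket] || 0) + 1;
  }
  const fractions = {};
  for (const [bucket, n] of Object.entries(counts)) {
    fractions[bucket] = n / pixels.length;
  }
  return fractions;
}
```

The fiddly part rcjordan mentions (setting up the qualifiers) is exactly those thresholds: JPEG compression and anti-aliasing smear "pure" colors, so the cutoffs need slack.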

rcjordan

> political meme
>PITA

Got one.

{
  id: 'High Contrast Meme (WBYO)',
  type: 'allOf',
  buckets: ['white', 'black', 'yellow', 'orange'],
  minEach: { white: 0.05, black: 0.10, yellow: 0.05, orange: 0.02 }, // min % of image per color
  minCombined: 0.80, // min % of image for all listed colors combined
}

nukes a bunch like this
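A plausible way to evaluate an 'allOf' rule like that one: every listed bucket must clear its own floor, and together the listed buckets must cover `minCombined` of the image. The field names below come from the posted fragment, but the evaluation logic is my assumption about the intent, not the script's actual code.

```javascript
// Evaluate an 'allOf'-style rule against measured bucket fractions
// (e.g. the output of a per-pixel color tally). Assumed semantics:
// every bucket in rule.buckets must meet its minEach floor, and the
// buckets' combined share must reach minCombined.
function ruleMatches(rule, fractions) {
  let combined = 0;
  for (const bucket of rule.buckets) {
    const frac = fractions[bucket] || 0;
    if (frac < (rule.minEach[bucket] || 0)) return false; // one bucket too scarce
    combined += frac;
  }
  return combined >= rule.minCombined;
}

// The WBYO rule from the post, as a plain object.
const wbyoRule = {
  id: 'High Contrast Meme (WBYO)',
  type: 'allOf',
  buckets: ['white', 'black', 'yellow', 'orange'],
  minEach: { white: 0.05, black: 0.10, yellow: 0.05, orange: 0.02 },
  minCombined: 0.80,
};
```

Under these assumed semantics, an image that is 40% white, 30% black, 10% yellow, and 5% orange matches (every floor cleared, 85% combined), while one where black drops to 5% does not.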