☆ Yσɠƚԋσʂ ☆
  • 766 Posts
  • 771 Comments
Joined 5Y ago
Cake day: Jan 18, 2020


@[email protected] just ran across this model, and it seems to work a lot better than the other ones I’ve tried locally. It seems faster as well. One thing I noticed is that it helps to set the temperature (which controls randomness) and top-p (which affects variety of output) to 0 to keep it more focused.
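
For anyone trying this, here’s roughly what those settings look like against a local OpenAI-compatible endpoint. This is a minimal sketch: the URL and model name are placeholders for whatever your local server (llama.cpp server, Ollama, etc.) exposes.

```python
import requests

# Placeholder endpoint and model name; point these at your own local server.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local-model",
        "messages": [{"role": "user", "content": "How many r's are in strawberry?"}],
        "temperature": 0,  # greedy decoding: no randomness
        "top_p": 0,        # shrink the sampling nucleus to the single most likely token
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```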



Open source researchers at Princeton, Stanford, and Huawei are working on efficient theorem proving using neural networks:

  • https://arxiv.org/html/2502.07640v2
  • https://arxiv.org/html/2502.00212v4
  • https://arxiv.org/html/2501.18310v1

The benchmarks to follow are:

  • https://paperswithcode.com/sota/automated-theorem-proving-on-minif2f-test
  • https://trishullab.github.io/PutnamBench/leaderboard.html




The intensity of the waves is very low in absolute terms, so they’re not harmful.

Microwave beaming, using radio-frequency phased-array antennas at intensity levels below midday sunlight, is deemed less harmful, with potential physiological effects manageable through thermoregulation.

https://restservice.epri.com/publicdownload/000000003002029069/0/Product
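
As a rough sanity check on that intensity claim (the beam and receiver figures below are hypothetical, not numbers from the report):

```python
# Compare a hypothetical beam's power density at the receiver to midday sunlight.
SOLAR_MIDDAY_W_PER_M2 = 1000.0   # approximate clear-sky solar irradiance

beam_power_w = 1_000_000.0       # assumed 1 MW delivered to the rectenna
rectenna_area_m2 = 5_000.0       # assumed receiver footprint

intensity = beam_power_w / rectenna_area_m2
print(f"Beam: {intensity:.0f} W/m^2 vs sunlight: {SOLAR_MIDDAY_W_PER_M2:.0f} W/m^2")
# -> Beam: 200 W/m^2, i.e. well below midday sunlight
```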




I’ve been slowly learning Putonghua for the past two years here. At this point, I just can’t see how anything gets better in the west in the near term, meanwhile life in China is improving by leaps and bounds each and every year.



The key bit:

This represents a potentially significant shift in AI deployment. While traditional AI infrastructure typically relies on multiple Nvidia GPUs consuming several kilowatts of power, the Mac Studio draws less than 200 watts during inference. This efficiency gap suggests the AI industry may need to rethink assumptions about infrastructure requirements for top-tier model performance.
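
To put that gap in rough numbers (the 200 W figure is from the article; the GPU node below is an assumed configuration, purely for illustration):

```python
# Rough power-draw comparison during inference (illustrative figures only).
mac_studio_w = 200      # reported Mac Studio draw during inference
gpu_node_w = 8 * 700    # assumed node of eight ~700 W data-centre GPUs

print(f"GPU node draws roughly {gpu_node_w / mac_studio_w:.0f}x more power")
# -> roughly 28x, before counting cooling and other data-centre overhead
```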



Critical support for the US destroying its science and technology base.



That’s going to accelerate the process for sure. Also worth noting that the US education system has been terrible for decades, and they’ve relied on poaching talent from around the world to stay at the top.


It looks like China has just reached parity with the west this decade, and now it’s starting to visibly pull ahead. I think we’ll see the pace of that accelerate over time as well, given that the very act of catching up requires a faster rate of development.


Seems like this is a shocking development for a lot of people who bought into the China collapse narratives floating around in the west.



The reality is that capitalism is already visibly failing today, and we’re seeing mass civil unrest grow across the western world as a result. It’s possible that band-aid solutions like UBI will be attempted, but it’s pretty clear that a major restructuring of economic fundamentals will be needed going forward.


My point is that the rapid explosion of automation ushered in by robotics, as we’re seeing in China, will make capitalism an unviable economic system.


There’s a second aspect to this as well. Capitalism is built on consumerism, which requires people to work to earn disposable income that they then spend on goods and services.


Seems like that would have to be the inevitable result of all this. When machines can do most jobs better than humans, the whole idea of working for a living stops making sense.



You’re right, the chip they leveraged isn’t actually that old. The key part is that we’re seeing a lot of optimizations happening in the software space now that allow existing chips to be used more efficiently.


I’m hoping this will go beyond AI stuff as well. Operating systems and a lot of general purpose software is also incredibly bloated. If Chinese companies start optimizing the software stack due to having slower chips, that would be a huge win.


Imagine not being able to understand that new software optimization techniques are continuously being discovered. 🤦


It appears you’ve missed the point here, which is that it turns out you can use older GPUs in creative ways to get a lot more out of them than people realized. Having the latest chips isn’t the bottleneck people thought it was.


Batteries are increasingly being used for large-scale storage now, especially with renewables becoming more prevalent.


That’s not that long to build out this kind of infrastructure basically from scratch.


There’s a whole range of chips now, the 9000S, 9020, and 9100, with the 9100 being 6nm. It’s clear and steady progress on display here. Meanwhile, there’s really nowhere to go past 1nm chips on a silicon substrate, so it’s not like western foundries have much of a path forward from here either.

There are two paths towards improving performance going forward. The first is to start using new substrates, the way China is doing with carbon nanotube based chips. This requires a phenomenal level of investment that can only really be done at the state level. The other path is to improve chip designs, the way Apple did with the M series. And these aren’t mutually exclusive, obviously, as better chip designs will benefit from faster substrates as well.


It’s weird to claim there’s no actual change when the change is very visible. China is now making their own chips domestically that are only a generation or two behind the bleeding edge. Also, why does it matter whether they’re subsidizing chip production or not?

The visible progress that you’re asking for will happen when all the pieces of the puzzle come together. China has to develop performant RISC-V based chip designs, which is what the XiangShan project is doing. They also need to physically build the EUV machines, which is what this article is talking about. Then they’ll start pumping out chips that are competitive with bleeding-edge TSMC chips.

You can look at how other industries like rail, electric vehicles, and clean energy developed in China previously. It’s always the same pattern: a few years of buildup, and then an explosion of the tech on a scale nobody has seen before.




Chinese scientists developed a chip using carbon nanotubes and a ternary logic system for faster, more energy-efficient computing than traditional silicon chips. CNTs, made from rolled-up sheets of graphene, offer superior electrical characteristics and ultra-thin structure, making them a promising material for next-generation semiconductors. Other advantages include higher carrier mobility, lower power consumption, and the potential for even smaller transistor sizes.
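
As a toy illustration of why a ternary system is attractive in the first place, here’s the radix-economy math: base 3 needs fewer digits than base 2 to cover the same range of values. (This is just the arithmetic, not a model of how the actual CNT chip works.)

```python
def digits_needed(n: int, base: int) -> int:
    """Number of base-`base` digits required to represent n."""
    digits = 1
    while n >= base:
        n //= base
        digits += 1
    return digits

for n in (255, 65_535, 2**32 - 1):
    print(f"{n}: {digits_needed(n, 2)} bits vs {digits_needed(n, 3)} trits")
# 255: 8 bits vs 6 trits
# 65535: 16 bits vs 11 trits
# 4294967295: 32 bits vs 21 trits
```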



Indeed, and if it gets there, I imagine it will start outpacing proprietary architectures because the whole world will be contributing to it.


It’s true for me. I generally don’t read through the think part. I make the query, do something else, and then come back to see what the actual output is. Overall, I find it gives me way better answers than I got with the version of R1 I was able to get running locally. Turns out the settings do matter, though.


Ultimately, what matters is whether it gets the correct answer or not. It’s interesting that yours wasn’t able to do the strawberry test while mine did it with a very short thinking cycle.


Yeah, mine does not do this. Here are the toggles I have on mine, which might make a difference:

<think>

Alright, let me figure out which number is bigger between 9.21 and 9.1999999999. Hmm, okay. Both numbers start with 9. so the whole number part is the same. The difference must be in the decimal parts.

First, I’ll write them down to compare digit by digit after the decimal point:

  • 9.21 is the same as 9.2100000000 if we extend it to match the same number of decimal places.

  • 9.1999999999 has more 9s in its decimal places.

Starting from the first decimal digit:

  • The tenths place: both have 2 and 1 respectively. Wait, no—hold on. Let me check again.

Wait a second, maybe I messed up. Let’s parse each number correctly:

For 9.21:

  • Tenths place (first after decimal): 2
  • Hundredths place: 1

The rest are zeros beyond that if we need more places.

For 9.1999999999:

  • Tenths place: 1
  • Then, the hundredths is 9, followed by all those 9s up to the tenth decimal digit?

Wait, actually let me clarify the positions properly. Let’s write both numbers with enough decimals:

9.21 can be written as 9.2100000000

9.1999999999 is already given.

Comparing digit by digit after the decimal point:

  • Tenths place: 2 vs. 1 → Here, 9.21 has a higher tenths digit (2 > 1).

Since the first differing digit at the tenths place is already larger for 9.21, there’s no need to look further into the other decimal places because even if all subsequent digits in 9.199… are 9s, they can’t overcome the difference in the tenths place. For example:

The first number has 0.2 in tenths vs the second having only 0.1. The rest of the decimals add up to less than 0.01 for the second number (since it’s 0.09 + 0.009 etc., which sums to 0.09 repeating, approaching but never reaching 0.1).

Therefore, even with all those trailing 9s in 9.199…, its value is still less than 9.2 (since it’s like 9.1 plus almost 0.1 but not quite), so 9.21 is larger.

</think>

The number 9.21 is larger than 9.1999999999.

Here’s why:

  1. Compare the tenths place (first digit after the decimal):
    • 9.21 has a 2 in the tenths place.
    • 9.1999999999 has a 1 in the tenths place.

Since 2 > 1, 9.21 is already larger at this stage. Even though 9.1999999999 continues with many 9s afterward, they cannot overcome the difference caused by having only 1 in the tenths place compared to the 2 in 9.21.

Final Answer:

\boxed{9.21}
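
For what it’s worth, this one is easy to sanity-check outside the model with exact decimal arithmetic:

```python
from decimal import Decimal

# Exact decimal comparison, no binary floating-point rounding involved.
print(Decimal("9.21") > Decimal("9.1999999999"))  # True
```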


Well, that’s interesting, because I got something completely different locally:

<think>

Okay, let me figure out how many times the letter "r" appears in the word "strawberry." First, I'll write down the word to visualize it better: S-T-R-A-W-B-E-R-R-Y.

Now, I'll go through each letter one by one. Starting with S, T – that's two letters without an R. The third letter is R, so that's one. Then A, W, B, E – those don't have Rs. Next comes another R, making it two. After that, there's another R, bringing the count to three. Finally, the last letters are Y. 

Wait, let me double-check because sometimes I might miscount. Let me break it down again:

S (1), T (2), R (3) – first R here. Then A (4), W (5), B (6), E (7). Next is R again (8) – that's the second R. Another R comes after (9th position?), so third R, and then Y (10). 

Hmm, maybe I confused the positions. Let me spell it out with numbers:

1. S
2. T
3. R (1st R)
4. A
5. W
6. B
7. E
8. R (2nd R)
9. R (3rd R)
10. Y

Yes, so the letters at positions 3, 8, and 9 are Rs. That totals three times. Wait, does "strawberry" have two or three Rs? Let me confirm by spelling it again: S-T-R-A-W-B-E-R-R-Y. Between the B and E, then after that, it's R-R-Y. So there are three Rs total.
</think>

The letter “r” appears **3 times** in the word "strawberry."  

Here’s the breakdown:  
S T **R** A W B E **R** **R** Y  

Count ’em: 1 (third letter), 2 (eighth letter), and 3 (ninth letter). 🍓
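
And the count itself takes one line to verify:

```python
# Count occurrences of "r" in the word directly.
print("strawberry".count("r"))  # -> 3
```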

You can grab it here.

I find it absolutely wild how quickly we went from needing a full-blown data centre to run models of this scale to being able to run them on a laptop.


I mean, what he’s saying is pure bullshit because we have no idea what it’s going to take to make AGI right now. It’s pretty clear that what he actually wants is to just work people harder for less pay. Even if what he said were true, the solution would simply be to hire more people. The oligarchs keep trying to create a fake labor shortage narrative here to justify more exploitation.



Indeed, the whole panic over people not needing chips because AI got more efficient was misguided. All it means is that AI is more accessible now, so more people will be running models locally, and they’ll buy chips for that. Companies like OpenAI, who were trying to build a business model around selling access to their AI as a service, are looking to be the big losers in all this. If this tech gets efficient enough that you can run large models locally, then there will be very few cases where people need to use a service, and even when they do, nobody is going to pay high subscription fees. Interestingly, Apple seems to have bet right here: they anticipated running AI models locally and started targeting their hardware towards that.


Oh haha, I didn’t realize that. I just kind of assumed it worked with the existing corpus of data. With Kimi, it explicitly tells you when it trawls websites and how many sites it looked at.