Tech topic: historic change in computing performance growth

Fri, Aug 24, 2012 - 11:23pm
Thieving Corp.
Offline
-
Washington, DC
Joined: Jul 14, 2011
148
538

AMD: PC CPU sales growth may slump

Rick Merritt

7/19/2012 5:56 PM EDT

SAN JOSE – Annual PC processor sales may slow to “a new baseline” of low single-digit growth long term, said Advanced Micro Devices. The prediction came as AMD reported an 11 percent decline in quarterly sales and expectations of a dip of about one percent for the coming three months.

“We believe the PC baseline may be resetting to a new level,” said Rory Read, AMD’s chief executive, speaking on a conference call with financial analysts. “It’s clear global economic activity is slowing,” he said.

Client PC sales have declined for the last three quarters and have been below historical averages for the last seven quarters, Read said. AMD also noted a slowdown in consumer notebook sales generally, as well as weak sales for its own resellers in China and Europe.

“We think there’s pressure in the entire PC ecosystem going forward, and the right focus is to be very disciplined in our execution,” said Read.
...
AMD also saw weaker-than-anticipated sales for server chips, especially among business end users. “We believe Bulldozer [AMD’s new x86 core] will drive modest share growth [in servers in the] near term,” Read said.

Overall, AMD saw processor selling prices decrease sequentially and year-over-year. The company is shipping its first 28nm graphics processors this year and will ship 28nm PC processors in 2013.
https://cdn.eetimes.com/electronics-news/4390763/AMD--CPU-sales-may-slump-to-single-digits
Fri, Aug 24, 2012 - 11:38pm
Thieving Corp.

Facebook likes wimpy cores, CPU subscriptions

Rick Merritt

6/21/2012 2:17 PM EDT

SAN FRANCISCO – Facebook could start running--at least in part--on so-called wimpy server CPU cores by the second half of 2013. Long term, the company wants to move to a systems architecture that lets it upgrade CPUs independent of memory and networking components, buying processors on a subscription model.
The social networking giant will not reveal whether it will use ARM, MIPS-like or Atom-based CPUs first. But it does plan to adopt so-called wimpy cores over time to replace some of the workloads currently run on more traditional brawny cores such as Intel Xeon server processors.

“It will be a journey [for the wimpy cores] starting with less intensive CPU jobs [such as storage and I/O processing] and migrating to more CPU-intensive jobs,” said Frank Frankovsky, director of hardware design and supply chain at Facebook, in an interview with EE Times at the GigaOm Structure conference here. “I’m bullish on the whole category even though we will need multiple wimpy cores to replace one brawny core—the net performance per watt per dollar is good,” he said.

“We’re testing everything, and we don’t have any religion on processor architectures,” Frankovsky said.

Facebook published a white paper last year reporting on its tests that showed the MIPS-like Tilera multicore processors provided a 24 percent benefit in performance per watt per dollar when running memcached jobs. Tilera is “the furthest along” of all the offerings, and they are “production ready today” with 64-bit support, he said.
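The metric Frankovsky cites, performance per watt per dollar, is easy to compute. A minimal sketch, where every throughput, power, and price number is a made-up placeholder rather than Facebook's or Tilera's actual data:

```python
def perf_per_watt_per_dollar(requests_per_sec, watts, dollars):
    """Throughput normalized by both power draw and unit cost; higher is better."""
    return requests_per_sec / (watts * dollars)

# Hypothetical brawny (Xeon-class) vs. wimpy (Tilera-class) parts.
brawny = perf_per_watt_per_dollar(requests_per_sec=200_000, watts=130, dollars=2000)
wimpy = perf_per_watt_per_dollar(requests_per_sec=60_000, watts=35, dollars=900)

# A wimpy part can lose on raw throughput yet win on the normalized metric,
# which is Frankovsky's point about needing several wimpy cores per brawny one.
print(f"wimpy/brawny metric ratio: {wimpy / brawny:.2f}")
```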

https://cdn.eetimes.com/electronics-news/4375880/Facebook-likes-wimpy-cores--CPU-subscriptions

Fri, Aug 24, 2012 - 11:47pm
Thieving Corp.

ARM CTO looks at architecture scaling for 2020 solutions

Anne-Françoise PELE

5/24/2012 7:42 AM EDT

BRUSSELS – Anticipating the propagation of the Internet of things, Mike Muller, chief technology officer at ARM Ltd., discussed the need for architecture scaling at the annual IMEC Technology Forum this week at the Square meeting center in Brussels, Belgium.

In his keynote, Muller compared the original ARM design of 1983 to today’s microprocessors. Major advances have been made in systems, hardware, operating systems and applications, but he outlined the need for architecture scaling in 2020 solutions, from tiny embedded sensors through to cloud-based servers, which together enable the Internet of things.

Muller provided a quick overview of the PC, from the very start when it was a hobby to when it became “the” platform for computing. “Then, we saw the beginning of the mobile dawn and mobile voice with the transition to 32-bit microcontrollers, meaning the advent of clean architecture,” he said. (See below)


[Chart omitted. Source: ARM and Asymco]

Technology scaling has enabled performance improvements, but in parallel, power is becoming a primary constraint in current designs, especially as we move toward an increasingly connected world.

Muller noted that operating in the sub-threshold region yields large power gains at the expense of performance, but he rapidly moved to Near-Threshold Computing (NTC), a design space where the supply voltage tends to equal the threshold voltage of the transistors. In contrast with sub-threshold conduction, NTC provides significant power savings without compromising performance.

Seeking the utmost in power reduction, Muller stated: “In the energy scavenging world, you don’t have the energy to do what you have to do. Run fast and stop is the easiest way. For aggressive power reduction, run fast and then power gate is most efficient but with a slow clock comes the need to control intra-cycle leakage. Sub-clock power gating minimizes leakage for slow clocks but you need to recalculate the logic on each new clock cycle.”
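Muller's "run fast and stop" point can be sketched with a toy energy model: dynamic energy scales as C·V² per cycle regardless of clock rate, while leakage accrues for as long as the block stays powered. The constants below are illustrative placeholders, not ARM figures:

```python
def task_energy_joules(cycles, freq_hz, v_volts, c_farads, leak_watts):
    """Dynamic switching energy (C*V^2 per cycle) plus leakage over the runtime."""
    runtime_s = cycles / freq_hz
    return c_farads * v_volts**2 * cycles + leak_watts * runtime_s

# Same 1e6-cycle task at the same voltage; only the clock differs.
# The fast run finishes in 10 ms and can then be power-gated,
# while the slow run leaks for a full second before finishing.
fast = task_energy_joules(1e6, freq_hz=100e6, v_volts=1.0, c_farads=1e-10, leak_watts=1e-3)
slow = task_energy_joules(1e6, freq_hz=1e6, v_volts=1.0, c_farads=1e-10, leak_watts=1e-3)
print(f"fast-then-gate: {fast*1e3:.2f} mJ, slow clock: {slow*1e3:.2f} mJ")
```

With these numbers the dynamic energy is identical in both runs; the slow clock loses purely on leakage, which is Muller's intra-cycle leakage concern.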
...
Software still uses the same compiler technology of 1986, noted Muller. “The software is moving to frameworks and new languages – Agile, HTML5, Ruby on Rails, Java, JavaScript, UML, Android. We have done compilers and have no idea what’s next. I don’t see anything approaching a revolution. I see no relevant solution except massive system reuse. We are going to have to go to the hardware world and bring software.”

https://cdn.eetimes.com/electronics-news/4373782/ARM-CTO-looks-at-architecture-scaling-for-2020-solutions

Fri, Aug 24, 2012 - 11:55pm
Thieving Corp.

Intel launches processor, no word on hertz

Remember when megahertz/gigahertz was the key number for CPUs? Now they talk about miniaturization instead of speed. Note the absence of clock frequency bragging in this press-release type piece:

Intel launches Ivy Bridge processor

Peter Clarke

4/23/2012 7:11 AM EDT


Intel is formally launching its Ivy Bridge processor to the market today (April 23). Ivy Bridge is the first device to be released officially on the company's 22-nm manufacturing process technology, which includes FinFETs – transistors built into a vertical fin of silicon.

UBM has already done an engineering examination of the Ivy Bridge processor in advance of the formal launch (see link below).

The launch covers quad-core devices aimed at desktop computers, with dual-core devices for ultrabooks – Intel's term for thin notebooks – due to be announced later in the spring, reports said. The move from the 32-nm Sandy Bridge processor to the 22-nm Ivy Bridge should provide 20 percent more performance at 20 percent less average power, according to one estimate.
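Taken at face value, the two 20 percent figures compound into a single efficiency number (my reading of the estimate, not Intel's own metric):

```python
# "20 percent more performance at 20 percent less average power", combined:
perf_ratio = 1.20    # Ivy Bridge performance relative to Sandy Bridge
power_ratio = 0.80   # Ivy Bridge average power relative to Sandy Bridge
perf_per_watt_gain = perf_ratio / power_ratio
print(f"performance per watt vs. Sandy Bridge: {perf_per_watt_gain:.2f}x")
```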

https://cdn.eetimes.com/electronics-news/4371471/Intel-launches-Ivy-Bridge-processor

Sat, Aug 25, 2012 - 12:07am
Thieving Corp.

In-the-box thinking -> up to 512 cores only?

But but... Patterson says it won't matter unless you have at least 1,000 cores. Oops. Still, we will see many attempts, this is just one example:

Research consortium claims solution for multi-core scaling

R. Colin Johnson

4/16/2012 2:39 PM EDT

PORTLAND, Ore.—Today, direct-write cache memories are the mainstay of microprocessors, since they lower memory latency in a manner transparent to application programs. However, designers of advanced processors have advocated a switch to software-managed scratchpads and message-passing techniques for next-generation multi-core processors, such as the Cell Broadband Engine Architecture developed by IBM, Toshiba and Sony, which is used for the PlayStation 3.

Unfortunately, software-managed scratchpads and message-passing techniques put an additional burden on application programmers and in that sense mark a step backwards in microprocessor evolution. Now Semiconductor Research Corp. (SRC) claims to have solved the scaling problem for next-generation processors with up to 512 cores, by using hierarchical hardware coherence that remains transparent to application programs as the natural evolution of today's multi-level caches.

"Designers are worrying about storage for future multi-core microprocessors, advocating a move to software coherence using scratchpad memories and message passing," said professor Dan Sorin at Duke University, principle researcher on the project."But that would require the programmer to manage data movement, which is not the way the industry should go."

Instead, Sorin's SRC-funded study, performed in cooperation with professor Milo Martin from the University of Pennsylvania and professor Mark Hill from the University of Wisconsin, proposes a hierarchical hardware coherence technique that the researchers claim scales as the square root of the number of cores, adding as little as two percent storage for processors with as many as 512 cores. Likewise, traffic, storage and energy consumption all grow very slowly as cores are added, allowing future processors to continue using direct-write caches with hardware coherence that is transparent to application programs.
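The square-root claim implies coherence storage stays a small fraction of the total as core counts grow. A sketch, back-solving the constant from the article's "two percent at 512 cores" figure (that calibration is my assumption, not a number from the paper):

```python
import math

# Calibrated so that 512 cores -> 2% storage overhead, per the article.
K = 0.02 / math.sqrt(512)

def coherence_overhead(cores):
    """Directory storage as a fraction of cache storage, growing as sqrt(cores)."""
    return K * math.sqrt(cores)

for n in (8, 32, 128, 512):
    print(f"{n:4d} cores: {coherence_overhead(n):.2%} extra storage")
```

Doubling the core count thus grows the overhead by only about 1.4x, which is why the authors argue hardware coherence "will not hit the wall."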

"These results will change the direction of computer architecture, by assuring designers that cache coherence will not hit the wall," said David Yeh director of integrated circuit and systems sciences at SRC (Research Triangle, N.C.) "We now know there are ways around the wall. Designers can stop worrying. All the right techniques are available today—you don't need new tricks to be invented, but just need to wisely using the technologies that are already available."

https://cdn.eetimes.com/electronics-news/4371062/Research-consortium-claims-solution-for-multi-core-scaling

LOL, we will see about that.

Sat, Aug 25, 2012 - 12:24am
Thieving Corp.

Intel's first 22nm CPU Ivy Bridge is off to an odd start

Intel Core i5 3470 Review: HD 2500 Graphics Tested, by Anand Lal Shimpi (AnandTech), 5/31/2012

Intel's first 22nm CPU, codenamed Ivy Bridge, is off to an odd start. Intel unveiled many of the quad-core desktop and mobile parts last month, but only sampled a single chip to reviewers. Dual-core mobile parts are announced today, as are their ultra-low-voltage counterparts for use in Ultrabooks. One dual-core desktop part gets announced today as well, but the bulk of the dual-core lineup won't surface until later this year. Furthermore, Intel only revealed the die size and transistor count of a single configuration: a quad-core with GT2 graphics.

Compare this to the Sandy Bridge launch a year prior where Intel sampled four different CPUs and gave us a detailed breakdown of die size and transistor counts for quad-core, dual-core and GT1/GT2 configurations. Why the change? Various sects within Intel management have different feelings on how much or how little information should be shared. It's also true that at the highest levels there's a bit of paranoia about the threat ARM poses to Intel in the long run. Combine the two and you can see how some folks at Intel might feel it's better to behave a bit more guarded. I don't agree, but this is the hand we've been dealt.

...

Intel's speed limit is 4 GHz:

Intel also introduced a new part into the Ivy Bridge lineup while we weren't looking: the Core i5-3470. At the Ivy Bridge launch we were told about a Core i5-3450, a quad-core CPU clocked at 3.1GHz with Intel's HD 2500 graphics. The 3470 is near identical, but runs 100MHz faster. We're often hard on AMD for introducing SKUs separated by only 100MHz and a handful of dollars, so it's worth pointing out that Intel is doing the exact same here. It's possible that 22nm yields are doing better than expected and the 3470 will simply quickly take the place of the 3450. The two are technically priced the same so I can see this happening.

Intel 2012 CPU Lineup (Standard Power)
Processor | Core Clock | Cores / Threads | L3 Cache | Max Turbo | Intel HD Graphics | TDP | Price
Intel Core i7-3960X | 3.3GHz | 6 / 12 | 15MB | 3.9GHz | N/A | 130W | $999
Intel Core i7-3930K | 3.2GHz | 6 / 12 | 12MB | 3.8GHz | N/A | 130W | $583
Intel Core i7-3820 | 3.6GHz | 4 / 8 | 10MB | 3.9GHz | N/A | 130W | $294
Intel Core i7-3770K | 3.5GHz | 4 / 8 | 8MB | 3.9GHz | 4000 | 77W | $332
Intel Core i7-3770 | 3.4GHz | 4 / 8 | 8MB | 3.9GHz | 4000 | 77W | $294
Intel Core i5-3570K | 3.4GHz | 4 / 4 | 6MB | 3.8GHz | 4000 | 77W | $225
Intel Core i5-3550 | 3.3GHz | 4 / 4 | 6MB | 3.7GHz | 2500 | 77W | $205
Intel Core i5-3470 | 3.2GHz | 4 / 4 | 6MB | 3.6GHz | 2500 | 77W | $184
Intel Core i5-3450 | 3.1GHz | 4 / 4 | 6MB | 3.5GHz | 2500 | 77W | $184
Intel Core i7-2700K | 3.5GHz | 4 / 8 | 8MB | 3.9GHz | 3000 | 95W | $332
Intel Core i5-2550K | 3.4GHz | 4 / 4 | 6MB | 3.8GHz | 3000 | 95W | $225
Intel Core i5-2500 | 3.3GHz | 4 / 4 | 6MB | 3.7GHz | 2000 | 95W | $205
Intel Core i5-2400 | 3.1GHz | 4 / 4 | 6MB | 3.4GHz | 2000 | 95W | $195
Intel Core i5-2320 | 3.0GHz | 4 / 4 | 6MB | 3.3GHz | 2000 | 95W | $177

The 3470 does support Intel's vPro, SIPP, VT-x, VT-d, AES-NI and Intel TXT, so you're getting a fairly full-featured SKU with this part. It isn't fully unlocked, meaning the max overclock is only four bins above the max turbo frequencies. The table below summarizes what you can get out of a 3470:

Intel Core i5-3470
Number of Cores Active | 1C | 2C | 3C | 4C
Default Max Turbo | 3.6GHz | 3.6GHz | 3.5GHz | 3.4GHz
Max Overclock | 4.0GHz | 4.0GHz | 3.9GHz | 3.8GHz

Power consumption doesn't go up by all that much because we aren't scaling the voltage up significantly to get to these higher frequencies. Performance isn't as good as a stock 3770K in this well-threaded test simply because the 3470 lacks Hyper-Threading support:

Overall we see a 10% increase in performance for a 13% increase in power consumption. Power-efficient frequency scaling is difficult to attain at higher frequencies. Although I didn't increase the default voltage settings for the 3470, at 3.8GHz (the max 4C overclock) the 3470 selects much higher voltages than it would at its stock 3.4GHz turbo frequency:
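The "four bins above max turbo" rule and the efficiency numbers above reduce to simple arithmetic (one bin being 100 MHz, per the review):

```python
BIN_MHZ = 100

# Core i5-3470 default max turbo by number of active cores (from the table above).
default_turbo_mhz = {1: 3600, 2: 3600, 3: 3500, 4: 3400}

# Partially unlocked: max overclock = default turbo plus four bins.
max_overclock_mhz = {cores: mhz + 4 * BIN_MHZ for cores, mhz in default_turbo_mhz.items()}
print(max_overclock_mhz)  # {1: 4000, 2: 4000, 3: 3900, 4: 3800}

# The 10% performance gain for 13% more power, as a perf-per-watt ratio:
efficiency_ratio = 1.10 / 1.13
print(f"overclocked perf/watt vs. stock: {efficiency_ratio:.3f}")  # slightly below 1.0
```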

https://www.anandtech.com/show/5871/intel-core-i5-3470-review-hd-2500-graphics-tested

Sat, Aug 25, 2012 - 12:30am
Thieving Corp.

IBM Pushes the Clock (Speed) on New Chips

Faster clock, for a price:

By Don Clark

Raw speed doesn’t seem to motivate many chip designers the way it used to. Not so at IBM.

The computing giant still sells lots of what the industry calls “big iron,” powerful machines designed to operate as a single system–not like the racks and racks of simpler servers that companies use for chores like serving up Web pages. IBM is disclosing details of two new chips for such high-end hardware, which exploit a technique that has lost favor in other parts of the market.

That approach boosts clock speed, or operating frequency, a performance measure akin to the revolutions per minute of a car’s engine. Chip giant Intel, after years of marketing megahertz and gigahertz improvements to PC users, started emphasizing other ways to boost performance in the last decade because of power consumption and heat worries. High frequencies are even more rare in chips for smartphones and tablets, where battery life is a key consideration.

But IBM keeps marching to a different beat. Its new version of the chip used in its venerable mainframe computer line–to be discussed at an upcoming technical conference in Silicon Valley this month–boasts a clock speed of 5.5 gigahertz, up from 5.2 gigahertz in the current version.

Big Blue is also updating the Power chip line, used in servers that run IBM’s variant of the Unix operating system. The existing Power7 chip comes with frequencies as high as 4.14 gigahertz; the next version, Power7+–also being discussed at that upcoming Hot Chips conference–will be 10% to 20% faster, IBM says.

By comparison, the high-end Intel Xeon chip aimed at comparable servers operates at 2.4 gigahertz.

Of course, clock speed is just one of many factors shaping performance. Both IBM and Intel use such tricks as boosting the number of processors in chips and adding special-purpose accelerator circuitry for jobs like compressing or encrypting data. Another time-tested technique is adding massive caches of memory, with IBM stressing the use of a technology called eDRAM as a differentiating feature.

There are many other choices and tradeoffs. The new IBM mainframe chips draw up to 300 watts, for example, and the Power7+ up to 190 watts–compared to 130 watts for the comparable Xeon. Such comparisons can be misleading; IBM says large systems often can replace many smaller machines, so they actually save on electricity.

https://blogs.wsj.com/digits/2012/08/03/ibm-pushes-the-clock-speed-on-new-chips/?mod=WSJBlog

Sat, Aug 25, 2012 - 12:51am
Thieving Corp.

Intel keynoter: Power consumption hurdles litter path to exascale

Can you spot the contradiction?

Intel keynoter: Power consumption hurdles litter path to exascale computing

Dylan McGrath

7/11/2012 4:07 AM EDT

SAN FRANCISCO—Thanks to parallelism and technology scaling, exascale computing will become a reality before the decade is out, but it won't live up to its full potential unless fundamental power consumption barriers are overcome, according to Intel Fellow Shekhar Borkar.
Delivering a keynote address at the Semicon West fab tool vendor tradeshow here Tuesday (July 10), Borkar noted that exascale computing is expected to become a reality by the end of the decade. By about 2018, engineers are expected to create an exascale supercomputer—capable of a 1,000-fold performance improvement compared with today's state-of-the-art petaflop systems.

If history is any guide, about 10 years after the existence of exascale supercomputers, the technology will find its way into PCs and then, eventually, into mobile systems, Borkar said.

But if current trends hold true, an exascale computer will consume vast amounts of power, according to Borkar. The formidable challenge, he said, is to create an exascale computing system that consumes only 20 megawatts (MW) of power.

If engineers can use new technology to create an exascale system that consumes only 20 MW, the same technology can also be used to dramatically lower the power consumption of lower-performance systems – to the point where giga-scale systems consuming only 20 milliwatts could be used in small toys, and mega-scale systems consuming only 20 microwatts could be used in heart monitors.
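The thread running through Borkar's three targets is that they are all the same efficiency number. The arithmetic, reading "giga-scale" as 10^9 operations per second and "mega-scale" as 10^6 (my reading of the terms, not the article's explicit definitions):

```python
# Performance (FLOPS) and power budget (watts) for each of Borkar's targets.
targets = {
    "exascale supercomputer":   (1e18, 20e6),   # 1 exaFLOPS in 20 MW
    "giga-scale toy":           (1e9,  20e-3),  # 1 gigaFLOPS in 20 mW
    "mega-scale heart monitor": (1e6,  20e-6),  # 1 megaFLOPS in 20 uW
}

for name, (flops, watts) in targets.items():
    print(f"{name}: {flops / watts / 1e9:.0f} GFLOPS per watt")
# Every line works out to the same 50 GFLOPS/W target, which is why hitting
# the 20 MW exascale budget also solves the toy and the heart monitor.
```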

So you're telling me, exascale will be real and it will even make it into PCs and mobiles, but it will consume vast amounts of power? WTF? "If engineers can use new technology to create an exascale system that consumes only 20 MW of power"? That's a big fricking if! So, which new technology would that be?

It's a head scratcher, all right:


Intel Fellow Shekhar Borkar speaks at the Semicon West tradeshow Tuesday.

Borkar said the way forward is to improve both energy per transistor and energy per compute operation. Conventional CMOS scaling improves both, he said, but not to a large enough degree. And indications are that energy per transistor at the circuit level will not decline as much as it has in the past, he said.
"Clearly we need to do something more than just scaling of technology," Borkar said.

Scaling down supply voltage increases energy efficiency, Borkar said. But doing so has a side effect—leakage power does not reduce as much as total power consumption, meaning that leakage power becomes a higher percentage of total power consumption, he said.

Borkar said near threshold voltage circuit design both reduces total power consumption and improves energy efficiency. "Clearly this is very promising technology," Borkar said. "But as you start solving the problem of energy efficiency, leakage power dominates."

During Borkar's 40-minute address Tuesday, he made several other observations. One was to emphasize the importance of "local computing" at a time when everyone is talking up the virtues of cloud computing. He noted that communications technologies used for moving data, including Bluetooth, Ethernet and Wi-Fi, use far more power than those used for local computing within a chip or system. "Clearly, data movement energy will dominate the future," Borkar said.

https://cdn.eetimes.com/electronics-news/4390114/Intel-keynoter--Power-consumption-hurdles-litter-path-to-exascale-computing-
