Create the Perfect Bill Design Format for Facilities with airSlate SignNow

Effortlessly manage your documents and streamline eSigning with our intuitive, budget-friendly platform.

Award-winning eSignature solution

Send my document for signature

Get your document eSigned by multiple recipients.

Sign my own document

Add your eSignature to a document in a few clicks.

Move your business forward with the airSlate SignNow eSignature solution

Add your legally binding signature

Create your signature in seconds on any desktop computer or mobile device, even while offline. Type, draw, or upload an image of your signature.

Integrate via API

Deliver a seamless eSignature experience from any website, CRM, or custom app — anywhere and anytime.
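For developers, an integration typically starts with an OAuth2 token and a couple of REST calls. The Python sketch below is illustrative only: the endpoint paths, payload fields, and response shapes are assumptions and should be verified against the current airSlate SignNow API reference before use.

```python
# Minimal sketch of triggering an eSignature flow from your own app via the
# airSlate SignNow REST API. Endpoint paths, payload fields, and response shapes
# are assumptions for illustration; check the current API documentation.
import requests

API_BASE = "https://api.signnow.com"   # assumed base URL

# Exchange user credentials for an OAuth2 access token (assumed endpoint and response shape)
token = requests.post(
    f"{API_BASE}/oauth2/token",
    data={"grant_type": "password", "username": "you@example.com", "password": "..."},
    auth=("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET"),
).json()["access_token"]

# Generate a shareable signing link for an already-uploaded document (assumed endpoint)
link = requests.post(
    f"{API_BASE}/link",
    headers={"Authorization": f"Bearer {token}"},
    json={"document_id": "YOUR_DOCUMENT_ID"},
).json()
print(link)
```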

Send conditional documents

Organize multiple documents in groups and automatically route them to recipients in a role-based order.

Share documents via an invite link

Collect signatures faster by sharing your documents with multiple recipients via a link — no need to add recipient email addresses.

Save time with reusable templates

Create unlimited templates of your most-used documents. Make your templates easy to complete by adding customizable fillable fields.

Improve team collaboration

Create teams within airSlate SignNow to securely collaborate on documents and templates. Send the approved version to every signer.

See airSlate SignNow eSignatures in action

Create secure and intuitive eSignature workflows on any device, track the status of documents right in your account, build online fillable forms – all within a single solution.

Try airSlate SignNow with a sample document

Complete a sample document online. Experience airSlate SignNow's intuitive interface and easy-to-use tools in action. Open a sample document to add a signature, date, text, upload attachments, and test other useful functionality.

Checkboxes and radio buttons
Request an attachment
Set up data validation

airSlate SignNow solutions for better efficiency

Keep contracts protected
Enhance your document security and keep contracts safe from unauthorized access with dual-factor authentication options. Ask your recipients to prove their identity before they can open your bill design format for facilities.
Stay mobile while eSigning
Install the airSlate SignNow app on your iOS or Android device and close deals from anywhere, 24/7. Work with forms and contracts even offline, then send your bill design format for facilities once your internet connection is restored.
Integrate eSignatures into your business apps
Incorporate airSlate SignNow into your business applications to quickly work on your bill design format for facilities without switching between windows and tabs. Benefit from airSlate SignNow integrations to save time and effort while eSigning forms in just a few clicks.
Generate fillable forms with smart fields
Update any document with fillable fields, make them required or optional, or add conditions for them to appear. Make sure signers complete your form correctly by assigning roles to fields.
Close deals and get paid promptly
Collect documents from clients and partners in minutes instead of weeks. Ask your signers to complete your bill design format for facilities and add a charge request field to your template to automatically collect payments during contract signing.
Collect signatures 24x faster
Reduce costs by $30 per document
Save up to 40 hours per employee per month

Our user reviews speak for themselves

Kodi-Marie Evans
Director of NetSuite Operations at Xerox
airSlate SignNow provides us with the flexibility needed to get the right signatures on the right documents, in the right formats, based on our integration with NetSuite.
Samantha Jo
Enterprise Client Partner at Yelp
airSlate SignNow has made life easier for me. It has been huge to have the ability to sign contracts on-the-go! It is now less stressful to get things done efficiently and promptly.
Megan Bond
Digital marketing management at Electrolux
This software has added to our business value. I have got rid of the repetitive tasks. I am capable of creating the mobile native web forms. Now I can easily make payment contracts through a fair channel and their management is very easy.

Why choose airSlate SignNow

  • Free 7-day trial. Choose the plan you need and try it risk-free.
  • Honest pricing for full-featured plans. airSlate SignNow offers subscription plans with no overages or hidden fees at renewal.
  • Enterprise-grade security. airSlate SignNow helps you comply with global security standards.

Bill design format for facilities

In today’s business environment, a streamlined bill design format for facilities is essential for improving operational efficiency and client satisfaction. airSlate SignNow offers a user-friendly platform that allows businesses to easily create, send, and sign documents electronically, making it an ideal choice for facilities management.

Steps to utilize the bill design format for facilities with airSlate SignNow

  1. Navigate to the airSlate SignNow website from your preferred browser.
  2. Register for a complimentary trial or log into your existing account.
  3. Select the document you wish to sign or distribute for signatures by uploading it.
  4. To make future use easier, convert your document into a reusable template.
  5. Access your uploaded file to modify it: incorporate fillable fields or add necessary details.
  6. Place your signature on the document and create signature fields for the recipients.
  7. Click 'Continue' to configure the eSignature invitation and send it out.
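For teams that want to automate this flow, the same steps can be scripted against the airSlate SignNow REST API. The Python sketch below loosely mirrors steps 3 through 7 above; the endpoint paths, field schema, and response shapes are assumptions for illustration and should be checked against the current API documentation.

```python
# Sketch of automating the upload / template / field / invite flow described above.
# Endpoint paths, payload fields, and response shapes are assumptions, not confirmed API details.
import requests

API_BASE = "https://api.signnow.com"       # assumed base URL
HEADERS = {"Authorization": "Bearer YOUR_OAUTH2_ACCESS_TOKEN"}

# Step 3: upload the bill you want signed
with open("facilities_bill.pdf", "rb") as f:
    doc_id = requests.post(f"{API_BASE}/document", headers=HEADERS,
                           files={"file": f}).json()["id"]

# Step 4: turn the uploaded document into a reusable template (assumed endpoint)
requests.post(f"{API_BASE}/template", headers=HEADERS,
              json={"document_id": doc_id, "document_name": "Facilities bill template"})

# Steps 5-6: add a signature field for the recipient (field schema is illustrative)
fields = {"fields": [{"type": "signature", "role": "Signer 1", "page_number": 0,
                      "x": 400, "y": 650, "width": 150, "height": 40, "required": True}]}
requests.put(f"{API_BASE}/document/{doc_id}", headers=HEADERS, json=fields)

# Step 7: send the eSignature invitation
invite = {"to": [{"email": "recipient@example.com", "role": "Signer 1", "order": 1}],
          "from": "you@example.com", "subject": "Facilities bill for signature"}
requests.post(f"{API_BASE}/document/{doc_id}/invite", headers=HEADERS, json=invite)
```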

By adopting airSlate SignNow, businesses see a significant return on investment thanks to its comprehensive features and flexible pricing model. The platform is designed to cater to small and mid-sized businesses, allowing for straightforward scaling as organizational needs evolve.

With clear and transparent pricing, there are no unexpected charges for support or additional features. Furthermore, all paid plans include robust customer support available around the clock. Start maximizing your document management process today!

How it works

Open & edit your documents online
Create legally-binding eSignatures
Store and share documents securely

airSlate SignNow features that users love

Speed up your paper-based processes with an easy-to-use eSignature solution.

Edit PDFs online
Generate templates of your most used documents for signing and completion.
Create a signing link
Share a document via a link without the need to add recipient emails.
Assign roles to signers
Organize complex signing workflows by adding multiple signers and assigning roles.
Create a document template
Create teams to collaborate on documents and templates in real time.
Add signature fields
Get accurate signatures exactly where you need them using signature fields.
Archive documents in bulk
Save time by archiving multiple documents at once.
Be ready to get more

Get legally-binding signatures now!

FAQs

Here is a list of the most common customer questions. If you can’t find an answer to your question, please don’t hesitate to reach out to us.

Need help? Contact support

What active users are saying — bill design format for facilities

Get access to airSlate SignNow’s reviews, our customers’ advice, and their stories. Hear from real users and what they say about features for generating and signing docs.

I've been using airSlate SignNow for years (since it...
5
Susan S

I've been using airSlate SignNow for years (since it was CudaSign). I started using airSlate SignNow for real estate as it was easier for my clients to use. I now use it in my business for employment and onboarding docs.

Read full review
Everything has been great, really easy to incorporate...
5
Liam R

Everything has been great, really easy to incorporate into my business. And the clients who have used your software so far have said it is very easy to complete the necessary signatures.

Read full review
I couldn't conduct my business without contracts and...
5
Dani P

I couldn't conduct my business without contracts and this makes the hassle of downloading, printing, scanning, and reuploading docs virtually seamless. I don't have to worry about whether or not my clients have printers or scanners and I don't have to pay the ridiculous drop box fees. Sign now is amazing!!

Read full review

Related searches to Create the perfect bill design format for facilities with airSlate SignNow

Simple bill design format for facilities
Bill design format for facilities word
Free bill design format for facilities
Bill design format for facilities pdf
Bill design format for facilities excel
Bill design format for facilities free download
Invoice template free download
Invoice template Word

Video transcript: Bill design format for Facilities

is scaling dead then why is Mark Zuckerberg building a 2 gwatt Data Center in Louisiana right why is why is Amazon building these multi- gigawatt data centers why is Google why is Microsoft building multiple gigawatt data centers plus buying billions and billions of dollars of fiber to connect them together because they think hey I need to win on scale so let me just connect all the data centers together with super high bandwidth so then I can make them act like one data center right towards one job right so this is this this whole like is scaling over narrative F falls on its face when you see what the people who know the best are spending [Music] [Applause] on great to be here psych you both are in the shop today Dylan this is one of the things we've been talking about all year which is you know how the world of compute is radically changing so Bill why don't you why don't you tell folks who uh who Dylan's all is and uh let's get started yeah we're thrilled have Dylan Patel with us from semi analysis Dylan has has quickly built I think the most respected research group on global semiconductor industry and so what we thought we'd do today is um is dive deep on on the intersection I think between everything Dylan knows from a technical perspective about the architectures that are out there about the scaling about the key players in the market globally the supply chain and the best and the brightest that of people we know are all listening and reading Dylan's work and then connect it to some of the business issues that our audience cares about and see where see where it comes out what I was hoping to do is kind of get a moment in time snapshot of all the semiconductor activity that relates to this big AI wave and try and put it in perspective um Dylan how'd you get into this uh so when I was eight my Xbox broke and I have a immigrant parents I grew up in rural Georgia so I didn't have much to do besides be a nerd and uh I couldn't tell them I broke my Xbox I had to open it up short the temperature sensor and fix it that was the way to fix it didn't know what I was doing at the time but then I stayed on those forms and I became a form Warrior right you know you see those people in the comments always yelling at you Brad you know it's like it's like uh that was me right as a child and you didn't know as a child then but you know awesome um it was just like you know arguing with people online as a child and then being passionate uh as soon as I started making money I was reading earnings from semiconductor companies and investing in them you know my internship money and uh yeah uh reading technical stuff as well of course and then working a little bit and then yeah and and and just tell us give us a quick thumbnail on semi analysis today like what is the business yeah so today we are a semiconductor research firm AI research firm we service companies our biggest customers are all hyperscalers uh the largest semiconductor companies uh private Equity as well as hedge funds and we uh sell data around where every data center in the world is how what the power is in each quarter what how the build outs are going um we sell around Fabs we track all, 1500 Fabs in the world for your purposes only 50 of them matter but like you know all 1500 Fabs around the world uh same thing with the the supply chain of like whether it be cables or servers or boards or Transformer substation equipment we try and track all of this on a very uh number driven basis as well as forecasting um and then we do Consulting 
around those areas yeah so I mean you know Bill you and I just talked about this I mean for ultimeter our team talks with Dylan and Dylan's Team all the time I think you're right um he's quickly emerged really just through hustle hard work doing doing the grindy stuff that matters um I think is a you know a benchmark for what's going on in the semiconductor industry and we're at this you know I suggested we're two years into this maybe you know this buildout and it's been hyper kinetic and one of the things bill and I are talking about is we we enter the end of 2024 taking a deep breath thinking about 25 26 and Beyond because a lot of things are changing and there's a lot of debates and it's going to have consequence for trillions of dollars of value in the public markets in the private markets how the hyperscalers are investing and where we go from here so Bill why don't you take us uh a little bit through the start of the questions well so I think if you're going to talk about Ai and semiconductors there's only one place to start which is to talk about Nvidia broadly um Dylan what percentage of global AI workloads do you think are on Nvidia chips right now so I would say if you ignored Google it would be over 98% uh but then when you bring Google into the mix it's it's actually more like 70 uh because Google is really that large of a percentage of AI workloads um especially production workloads uh you know by production you mean inhouse workloads for gole production as in things that are making money um things that are making money they're actually probably it's probably even less than 70% right because you think about Google search and Google ads are the two largest you know two of the largest AI driven uh businesses in the world right you know the only things that are even comparable are like Tik Tok and and metas right and those Google workloads I think it's important just to kind of frame this um those are running on Google's proprietary chips they're um they're nonl llm workloads correct so Google's production workloads for non llm and llm run on their internal silicon and I think one of the interesting things is yes you know everyone will say Google dropped the ball on Transformers and llms right how did open AI do GPT right um and not Google but Google was running Transformers even in their search workload since 2018 2019 uh the Advent of Bert which was one of the most uh most well-known most popular Transformers before we got to the GPT Madness um is has been their in their production search workloads for years so uh they run Transformers on their own in in in their search and ads business as well the um going back to this this number You' use 98% if you just look at I guess PE workloads people are purchasing to do work on their own so you take the captives out you're you're at 98 right this is a a dominant Landslide at this moment in time back to Google for a second they also are one of the big customers of Nvidia they they do buy a number of gpus they buy some for you know some YouTube video related workloads internal workload right so not everything internal is like is is is Google uh is a GP is a TPU right uh they do buy some for some other internal workloads but by and large their GPU purchases are for Google Cloud to then rent out to customers right uh because they are well they do have some customers for their internal silicon externally such as Apple um the vast majority of their external rental business for AI uh in terms of cloud business is still gpus and that's Nvidia 
GPU correct Nvidia gpus why are they so dominant why why is in Nidia so dominant so I like to think of it as like a three-headed dragon right I would say every Semiconductor Company in the world sucks at software except for NVIDIA right so there's software there's of course Hardware um people don't realize that inidia is actually just much better at Hardware than most people they get to the newest Technologies first and fastest because uh they drive like crazy uh towards hitting certain production goals targets they get chips out faster than other people from thought design to uh deployed and then uh the networking side of things right they bought melanox and they've driven really hard with um the networking side of things so those those three things kind of combined to make a three-headed dragon that no other Semiconductor Company can do alone yeah i' call out a piece you did Dylan where you you helped everyone visualize the complexity of one of these modern Cutting Edge Nvidia deployments that involves the racks the the memory the the networking the size of scale of the whole thing uh super helpful I mean there's this comparison often times between companies that are truly Standalone chip companies they're not systems companies they're not infrastructure companies and and Nvidia but I think one of the things that's deeply underappreciated is the level of competitive Moes that Nvidia has you know software is becoming a bigger and bigger component of squeezing efficiencies and you know total cost of operation out of out of these infrastructures so talk to us a little bit about that schema you know that Bill's referring to like there are many different layers of systems architecture and how that's differentiated from maybe you know a custom Asic or or or an AMD right so when you when you look broadly at the GPU right no one buys one chip for running an AI workload right uh models have far exceeded that right um You Look At You know today's Leading Edge models like gp4 was you know over trillian parameters right uh a trillian parameters is over a terab of memory uh you you can't get a chip with that capacity you a chip can't have enough performance to serve that model even if it had enough cap memory capacity so therefore you must tie together many chips together um and so what's interesting is that Nvidia has has seen that and built an architecture that has many chips Network together really well called nvlink uh but fun enough and the thing that uh many ignore is that Google actually did this alongside Broad um you know and they did it before Nvidia right you know today everyone's freaking out about or not freaking out but like everyone's like very excited about nvidia's Blackwell system right it is a rack of gpus that is the purchased unit right it's not one server it's not one chip it's a rack this rack weighs three tons and it has thousands and thousands of cables and all these things that Jensen will probably tell you right extremely complex interestingly Google did something very similar in 2018 right with the TPU um now they couldn't do it alone right they know the software they know they know what the compute element needs to be but they didn't know anything they can't do they can't do a lot of the other difficult things like package design like uh networking and so they had to work with other vendors like broadcom to do this um and because Google had such a unified vision of where AI models were headed they actually were able to build this system the system architecture that was 
optimized for AI right whereas at the time in video was like well how big do we go um I'm sure they could have tried to scale up bigger but what they saw is the primary workloads didn't require scaling to that degree right now everyone sort of sees this and they're running towards it but nvidia's got Blackwell coming now um competitors like AMD and others have to you know make an acquisition recently to help them get into the system design right because building a chip is one thing but building many chips that connect together cooling them appropriately networking them together making sure that it's reliable at that scale is is a whole host of problems that semiconductor companies don't have the engineers for where would you say Nvidia has been investing the most in incremental differentiation I would say for differentiating NVIDIA has primarily focused on supply chain things which you know might sound like oh well like yeah they're just like ordering stuff no no no no you have to work deeply with the supply chain to build the Next Generation technology so that you can bring it to Market before anyone else does right because if Nvidia stands still they will be eaten up right um they're you know sort of the the Andy Grove only the paranoid Will Survive Jensen is probably the most paranoid man in the world right he he's he's known for many years since before the llm craze all of his biggest customers were building AI chips right before the llm craze his main competitors were like oh we should make gpus and and yet he stays on top because he's bringing to te Market Technologies at at volume that no one else can right and so whether it be in networking whether it be in uh Optics whether it be in uh water cooling right whether it be in you know all sorts of other power delivery all these things he's bringing to Market technologies that no one else has and he has to work with the supply chain and teach those supply chain companies and they're they're helping obviously they have their own capabilities to build things that don't exist today and and Nvidia is trying to do this on an annual Cadence now that's incredible Blackwell Blackwell Ultra Ruben Ruben Ultra they're they're going so fast they're driving so many changes every year of course people are going to be like oh no they're there some delays in Blackwell yeah of course you're Drive look how hard you're driving the supply chain is is that part like how big a part of the competitive Advantage is the fact that they're now on this annual Cadence right because it seems like by going there it almost precludes their competitors from catching up because even if you skate to where Blackwell is right you're already on Next Generation within 12 months he's already planning two or three generations ahead because it's only two two to three years ahead well the funny thing is uh a lot of people in video will say Jensen doesn't plan more than a year or a year and a half out um because they change things and they'll deploy them out that fast right no sem every other Semiconductor Company takes years to deploy you know make architecture changes but you said if they stand still there they would they would have competition like what what would be their area of vulnerability or what would have to play out in the market for other alternatives to take more share of the workloads yeah so so the main thing for NVIDIA is you know hey this workload is this big right it's it's it's well over 100 billion dollars of spend um for the biggest customers they have 
multiple customers that are spending billions of dollars I can hire enough Engineers to figure out what how to run my model on other Hardware right now maybe I can't figure out how to train on other Hardware but I can figure out how to run it for inference on other Hardware so nvidia's Mo in in inference is actually a lot smaller on software um but it's a lot bigger on T they just have the best hardware now what does what does the best hardware mean it means Capital cost and it means operation cost and it means performance right performance TCO yes um and nvidia's whole Moe here is if they stand still their performance TCO doesn't grow um but interestingly they are right like with Blackwell not only is it way way way faster anywhere from 10 to 15 times on really large models for inference because they've optimized it for very large language models they've also decided hey we're going to cut our margin too somewhat because I'm competing with uh Amazon's you know chip and TPU and and AMD and all these things they've decided to cut their margin too so so between all these things they've they've decided that they need to push performance TCO not 2x every two years right you know More's law right they've decided they need to push performance five performance TCO 5x maybe every year right at least that's what Blackwell is and we'll see what Ruben does but you know 5x Plus in a single year for performance TCO is an insane pace right um and then you stack on top like hey AI models are actually getting a lot better for the same size the cost for delivering llms is is tanking which is going to induce Demand right um so this and just to to clarify one thing you said or at least restated to make sure I think when you said the software is more important for for training you meant Cuda is more of a differentiator on training than it is on inference so so I think a lot of people in the investor Community you know call Cuda which is just like one layer for all of Nvidia software there there's a lot of layers of software but for for Simplicity sake you know regarding networking or what runs on switches or what runs on you know all sorts of things Fleet Management stuff all sorts of stuff that Nvidia makes that we'll just call Kudo for Simplicity sake um but all of this software is is stupendously difficult to replicate in fact no one else has deployments to do that besides the hyperscalers right and a000 gpus is is like a Microsoft inference cluster right it's not it's not a training cluster um so so when you talk about hey what what is what is the difficulty here right on training this is users constantly experimenting right researchers saying hey let's try this let's try that let's try this let's try that I don't have time to optimize and hand ring out the performance I rely on nvidia's performance to be quite good with existing software Stacks or very little effort right um but when I go to inference Microsoft is deploying five six models across how many billions of Revenue right so all of open eyes Revenue plus what 10 billion of inference Revenue yeah so so they have1 billion of Revenue here and they're deploying five models right gb4 40 you know 40 mini and now you know the reasoning yeah the reasoning models um 01 and yeah so so it's like they're they're deploying very few models um and those change what every six months right um so every six months they get a new model and they deploy that so within that time frame you can you can hand ring out the performance and so Microsoft has deployed uh GPD 
style models on other competitor's Hardware such as AMD um they and and some of their own but mostly AMD and so they can ring that out with software uh because they can spend hundreds of Engineers dozens of Engineers hours uh hundreds of engineer hours or thousands of engineer hours on working this out because it's such a unified sort of workload right I want to I want to get you to comment on this chart this is a chart we showed earlier in the year that I think was kind of a a moment for me with Jensen when he was in I think the Middle East and for the first time he said not only are we going to have a trillion dollars of new AI workloads over the course of the next four years um he said but we're also going to have a trillion dollars of CPU replacement of data center replacement workloads um over the course of the next four years so that's an effort to model it out and I you know we referenced it on on the Pod with him and and he seemed to indicate that it was directionally correct right that he still believes that it's not just about because there's a lot of fuss in the world about um you know pre-training and what if what if pre-training doesn't continue a pace and it seemed to suggest that there was a lot of AI workloads um that had nothing to do with pre-training that they're working on but also that they had all of this data center replacement do you buy that I've heard a lot of people push back on the data center replacement and say there's no way people are going to you know rebuild a CPU data center with a bunch of Nvidia gpus it just doesn't make any sense but he his argument is that an increasing number of these applications even things like Excel and PowerPoint are becoming machine learning applications and require accelerated compute Nvidia has been pushing non- AI workloads for accelerators for a very long time right professional visualization right Pixar uses a ton of gpus right to make every movie um you know all these Seaman engineering applications right all these things do use gpus right um I would say there are drop in the bucket compared to you know AI um the other aspect I would say is and this is sort of a bit contentious with your chart I think but IBM Mainframe sell more volume and revenue every single cycle right so you know yes no one in the bay uses mainframes or talks about main frames but they're still growing right um and so like I would I would say the same applies to CPUs right um to Classic workloads just because you know AI is here doesn't mean web serving is like going to slow down or databasing is is going to slow down now now what does happen is that that line is like this and the AI line is like this um and furthermore right like when you talk about what you know hey these applications they're now ai right you know Excel with co-pilot or word with co-pilot whatever right there's they're going to be they're still going to have all of those classic operations you don't get rid of what you used to have right uh Southwest doesn't stop booking flights they just run AI analytics on top of their flights to maybe you know do pricing better or whatever right um so I would say that's still happens but there is an element of replacement that is sort of misunderstood right which is given how much people are deploying how tight the supply chains for data centers are data centers take longer they're longer time Supply chains unfortunately right um which is why you see see things like what elon's doing but when you when you think about that well how can I get power 
then right so you can do what cor weave is doing and go to crypto mining companies and just like clear them out and put a bunch of GPS in them right retrofit the data center put data data GPS on them like they're doing in Texas or you can do uh what some of these other folks are doing which is hey well my depreciation for CPU servers has gone from three years to six years in just a handful of years why because Intel's progress has been this right so so in reality like the old Intel CPU is not that much better but all of a sudden over the last couple years amds burst onto the scene uh arm CPUs have burst onto the scene uh Intel's started to write the ship now I can upgrade the the the most the plurality of of Amazon CPUs in their data centers are 24 core Intel CPUs from that were manufactured from 2015 to 2020 same more or less the same architecture this 24 core CPU I can buy 128 or 192 core CPU now today where each CPU core is higher performance and well if I just replace like six servers with one I basically invented power out of thin air right I mean like you know in in effect because these old servers which are six plus years old or even you know they can they can just be deprecated and put so with capex of new servers I can replace these old servers and now you know when I do every time I do that I can throw another AI server in there right so this is sort of the yes there is some replacement I still need more total capacity but that total capacity can be served by fewer machines maybe if I buy new ones um and generally the market is not going to shrink it's still going to grow just nowhere close to what AI is and AI is causing this behavior of I need to replace so I can get power hey this this reminds me of a point Sacha made on the Pod last week that I've seen replayed a bunch of times and I think is fairly misunderstood he said last week on the Pod that he was power and data center constrained not chip constrained what I think it was was more a assessment on the real bottleneck which is Data Centers and power as opposed to gpus because gpus have have come online um and so I think the the case you just made I think helps to clarify that well before before we dive into the alternatives to Nvidia I thought we would hit on this uh this pre-training uh scaling debate that you wrote about in your last piece Dylan and we've already talked about quite a bit but but why don't you give us your view of what's going on there I think Ilia was the one the the most credible uh AI specialist that brought this up and then it got repeated and and and cross analyzed quite a bit and ju bill just to repeat what it is I think IIA said you know data is the fossil fuel of the inter of AI that we've consumed all the fossil fuel because we only have but one internet um and so the huge gains we got from pre-training are not going to be repeated and and some some experts had predicted a data the data would run out uh a year or two ago so it wasn't it wasn't um like out of out of nowhere that that argument came to light anyway let's hear what Dylan has to say so pre-training scaling laws are are pretty simple right you get more compute and then I throw it at a model and it'll get better now what is that that breaks out into two axes right data and parameters right you know the bigger the model the more data the better and there's actually an optimal ratio right so uh Google published a paper called chinchilla which says the optimal ratio of uh data to parameter you know model size and and that's the scaling 
thing now what happens when the data runs out well I don't really get much more data but I keep growing the size of the model because my my budget for compute keeps growing um this is a bit not fair though right we have barely barely barely tapped video data right so there is a significant amount of data that's not tapped it's just video D data is so much more information than written data right and so therefore you're throwing that away but I think that's like that's part of the the the like you know that there's a bit of misunderstanding there but more importantly text is the most efficient domain right humans generally yes a picture paints a thousand words but if I write 100 words I can probably you can tell figure out FAS and the transcripts of most of those videos were already yeah the transcripts of many of those videos are um in there already but you know regardless the data data is like a big axis now the problem is this is only pre-training right quote pre training a model is more than just the pre-training right there's there's many elements of it um and so people have been talking about hey time compute yeah that's important right you can Community Scale Models if you figure out how to make them think and recursively like be like oh that's not right let me think this way oh that's not right that let me you know much like you know you don't H hire an intern and say hey what's the answer to x or you don't hire a PhD and say hey what's the answer to x you're like go work on this and then they come back and bring something to you so inference time compute is important but really what's more important is as we continue to get more and more compute can we improve models if data is run out and the answer is you can create data out of thin air almost right um in certain domains right and so this is the whole the debate around scaling laws is um how can we create data right and so what is what is what is ilas company doing most likely what is what is Mira's company doing most likely Mir Madi CTO of uh of open AI um what are what are you know all these companies focused on open AI um what are all these companies focused they have gnome Brown who's like sort of one of the big reasoning people on road shows just going and speaking everywhere basically right um what what are they doing right they're saying hey we can still improve these models yes spending compute at inference time is important but what do we do at training time because you you can't just tell a model think more and it gets better you have to do a lot of training time and so what that is is I take the model I take an objective function I have right um what is the square root of 81 right now if I told you the square ask asked many people what's the square root of 81 many could answer but I bet many people could answer if they thought about it more like almost you know a lot more people right maybe it's a simplistic but you say hey let's have the existing model do that let's have it just run every possible uh you know not possible many permutations of this uh start off with say five and then anytime it's unsure Branch into multiple so you start out you have hundreds of quote unquote rollouts or trajectories gener of generated data most of this is garbage right you prune it down to hey only these paths got to the right answer right okay now I feed that and that is now new training data yeah um and so I do this with every possible area where I can do functional verification functional verification I.E hey this code compiles hey 
this unit test that I have in my code base um how can I generate the solution how can I generate the function okay now now and and you do this over and over and over and over again across many many many different domains where you can functionally prove it's real you generate all this data you throw away the vast vast majority of it but you now have some chains of thought that you can train the model on which then it will learn how to do that more effectively and it generalizes outside of it right and so this is the whole domain now when you talk about scaling laws it's it's point of diminishing returns is kind of uh not proven yet by the way right because it's it's more so hey the scaling laws are a log log access a log I.E it takes 10x more investment to get the next iteration well 10x more investment you know I you know going from 30 million to 300 million 300 million to 3 billion is relevant but when Sam wants to go from 3 billion to 30 billion it's it's a little difficult to raise that money right that's why you know the most recent rounds are a bit like oh crap we can't spend 30 billion on the next run um and so the question is well that's just one access where have we gone on synthetic data oh we're still like very early days right we've we've spent tens of millions of dollars maybe on synthetic data um and is you with synthetic data you used a qualifier in certain domains when they released 01 it also had a qualifier like that in certain domains I'm just saying those two scaling a do better in certain domains and aren't as applicable in others and we have to figure that out yeah I think I think one of the interesting things about AI is is at first in 2022 2023 uh with the release of diffusion models with the release of text models people like oh wow artists are the one that are the most you know out of luck not not technical jobs actually these things suck at technical jobs but with these new a this new axis of synthetic data um and test time compute actually where the areas where we can teach the model we can't teach it what good art is because we have no way to functionally prove what good art is we can teach it to write really good software we can teach it how to do mathematical proofs we can teach it how to engineer systems because there are while there are trade-offs and this is not like it's not just a one zero thing especially on Engineering Systems this is something you can functionally verify is this works or not or this is correct the output and then the model can iterate more often exactly um goes back to the alpha go thing and why that was a a Sandbox that could allow for novel uh novel moves and plays because you could Traverse it and and run synthetically you could just let it let it create and create and create putting on my investor Hat Public investor hat here there is a lot of tension in the world over Nvidia as we look forward at 2025 and this question of pre-training right and if in fact you know we've seen we've plucked 90% of the lwh hanging fruit that comes from pre-training then do people really need to buy bigger clusters and I think there's a view in the world particularly post ilas comments that you know no the 90% benefit of pre-training is gone but then I look at you know the comments out of htan this week um uh you know during their earnings call that all the hyperscalers are building these million uh you know xpu clusters I look at you know the commentary out of x. 
that they're going to build 200 or 300,000 GPU clusters you know meta reportedly building much bigger clusters Microsoft building much bigger clusters how do you square those two things right if everybody's right and pre-training is dead then why is everybody building much bigger clusters so the scaling right going goes back to what's the optimal ratio what's the how do we continue to grow right just blindly growing parameter count when we don't have any more data or the data is very hard to get at IE because it's video data um wouldn't give you so many gains and then there's also the access of it's a log chart right you need 10x more to get the next jump correct corre right so when you look at both of those oh crap like I need to invest 10x more um and I might not get the full gain because I don't have the data but the the data generation side we are so early days with this right so the point is I'm still going to squeak out enough gain that it's a positive return particularly when you look at the competitive Dynamic um you know our models versus our competitor models so it's a rational decision to go from 100,000 to 200,000 or 300,000 even if you know the the kind of big onetime gain in pre-training is behind us or rather it's it's exponentially more it's logarithmically more expensive to do that game correct right so it's still there like the gain is still there but like the sort of whole like Orion has failed sort of narrative around open AI uh model and they didn't release Orion right they released 01 which is sort of a different axis um it it's partially because hey this is you know because of these like data issues but it's partially because they did not scale 10x right cuz scaling 10x from four to this is actually was like I think this is Gavin's point right wa well I would ALS let's go to Gin in a second one of the reasons this became controversial I think is and I don't Dario and Sam had prior to this moment at least the way I heard him made it sound like they were just going to build the next biggest thing get the same amount of gain they had left that impression and so we get to this place as you described and it's not quite like that and then people go oh what does that mean like it causes them to raise their head up so I think I think um they have never said the chinchilla scaling laws were what delivers us you know AGI right they've had scaling scaling is you need a lot of compute and and guess what if you have you know if you have to generate a ton of data and throw away most of it because hey only some of the paths are good you're spending a ton of compute at train time right and this is sort of the access that is like we may actually see models improve faster in the next six months to a year than we saw them improve in the last year because there's this new axis of synthetic data generation and the amount of compute we can throw at it is we're we're still we're still right here in the scaling law right we're not here we haven't pushed it to billions of dollars spent on synthetic data generation functional verification reasoning training we've only spent Millions tens of millions of dollars right so what happens when we scale that up so there's there is a new axes of spending money and then there's of course test time compute as well I.E spending time at inference to get better and better so it's possible um and in fact many people at these Labs believe that the next year of gains or the next six months of gains will be faster because they've unlock this new axis through a 
new methodology right and it still scale right because this requires stupendous amounts of compute you're generating so much more data than exists on the web and then you're throwing away most of it but you're generating so much that you have to you have to run the model constantly right what what domains do you think are are most applicable with this approach like where where where will synthetic data be um most effective and maybe maybe you could do a a both a pro and a like a scenario where it's going to be really good and one where it wouldn't work as well yeah so I I think that that goes back to the point around what can we functionally verify is true or not what can I grade and it's not subjective what class can you know you take in college and you get the card you get the thing back and you're like oh this is BS or like dang I messed that up right there's certain classes where you can a determinism of of grading the output right exactly so if there if if it can be functionally verified amazing if it has to be judged right so there's sort of two ways to judge an output right there is you know W without using humans right this is sort of the whole scale AI right what what were they initially doing they were using a ton of uh to to create good data right label data but now humans don't scale for this level of data right humans post on the internet every day and we've already tapped that out right kind of more or less on what what are domains that work so so this these are domains where hey in Google when they push data to you know any of their services they have tons of unit tests these unit tests make sure everything's working well why can I have the llm just generate a ton of outputs and then use those unit tests to grade those outputs right because it's pass or fail right it's not um and then you can also like grade these outputs in other ways like oh it takes this long to run versus this long to run so then you have various there's other areas such as like hey image Generation Well actually it's harder to say which which image looks more beautiful to you versus me you know I might I might like you know some like sunsets and flowers and you might like the beach right you you can't really argue what is good there so there's no functional verification there is only um subjective right so the objective nature of this is where so where do we have objective grading right right we have that in code we have that in math we have that in engineering um and while these can be complex like hey engineering is not just this is the best solution it's hey given all the resources we've had and given all these trade-offs we think this is the best trade-off right that's usually what engineering ends up being well I can still I can still look at all these axes right whereas in in in subjective things right like hey what's the best way to write this email or what's the best way to negotiate with this person that's difficult right uh that's not something that is objective what are you hearing from the hyperscalers I mean they're all out there saying our capex is going up next year we're building larger clusters um you know is that in fact happening like what's happening out there yeah so I think when you look at the streets estimates for capex they're all far too low um you know based on a few factors right um so when we we track every data center in the world and it's it's it's insane how much especially Microsoft and now uh meta and Amazon and and and uh you know and many others right but those guys 
specifically are spending on Data Center capacity and as that power comes online which you can track pretty easily if you look at all of the different uh regulatory filings and you satellite imagery all these things that we do you can see that hey they're going to have this much data center capacity right right um what you it's accelerating what are you going to fill in there right it turns out you you have to fill to fill it up you know you can make some estimates around how how much power is each GPU Allin everything right satcha said he's going to slow down that a little bit but they' signed deals for next year rentals right for some in some of these cases right part of the reason he said is he expects his Cloud Revenue in the first half of next year to accelerate because he said we're going to have a lot more data center capacity and we're currently capacity constraint so you know what they're you know like again going back to the is scaling dead then why is Mark Zuckerberg building a 2 gwatt Data Center in Louisiana right why is why is Amazon building these multi- gigawatt data centers why is Google why is Microsoft building multiple gigawatt data centers plus buying billions and billions of dollars of fiber to connect them together because they think hey I need to win on scale so let me just connect all the data centers together with super high bandwidth so then I can make them act like one data center right towards one job right so this is this this whole like is scaling over narrative F falls on its face when you see what the people who know the best are spending on right you talked a lot at the beginning about nvidia's differentiation around these large coherent clusters that are used in pre-training um can you see anything like I guess one someone might be super bullish on inference and keep building out a data but they might have thought they were going to go from 100,000 noes to 200 to 400 and might not be doing that right now if if this pre-training thing is real are you seeing anything that gives you any visibility on that Dimension so um when you think about training a neural network right it is it is doing a forwards pass and a backwards pass right forwards pass is generating the data basically um and it's half as much compute as the backwards pass which is updating the weights when you look at this new paradigm of synthetic data generation grading the outputs uh and then training the model you going to do many many many forward passes before you do a backwards pass what is serving a user that's also just a forwards pass so it turns out that there is a lot of um inference in training right in fact there's more inference in training than there is updating the model weights because you have to generate hundreds of possibilities and then oh you only train on a couple of them right so there there is that Paradigm is very relevant um the other Paradigm I would say that is very relevant is um when you're when you're training a model um do you necessarily need to be collocated for every single aspect of it right right um and this is and what's the answer the answer is depends on what you're doing if you're in the pre-training Paradigm then maybe you don't yeah you need it to be collocated right um You need everything to be in one spot yeah why Microsoft in in q1 and Q2 sign these massive fiber deals right um and why are they why are they building multiple similar size data centers in Wisconsin and Atlanta and and and Texas and so on and so forth right in Arizona why are they doing 
that because they already see the researches there for being able to split the workload more appropriately which is hey this data center it's not serving users it's running inference it's just running inference and then you know throwing away most of the output because some of the output is good um and because I'm grading it right and it's doing they're doing this while they're also updating the model in other areas so the the whole Paradigm of training you know pre-training is is is not something down it's just it's logarithmically more expensive each for each generation for each incremental Improvement people are finding other ways to but there's other ways to not just continue this but hey I don't need a logarithmic increase in spend to get the next generation of improvement in fact through this reasoning uh training and inference I can get that logiic Improvement in the model without ever spending that now I'm going to I'm going to do both right because this is a because each model jump has unlocked huge value right I me the the you know the thing that I I think so interesting you know I hear Kramer on CNBC this morning you know and they're talking about is this Cisco from 2000 I was in Omaha Bill Sunday night for dinner um you know they're obviously big investors and utilities and they're watching what's going on in the data center buildout and they're like is this Cisco from 2000 so I had my team pull up a chart for Cisco you know 2000 and and we'll show it on the Pod but you know they peaked at like 120 PE right and yeah you know if you look at the fallof that occurred in revenue and in IAH you know and then it had 70% compression in the in the priced earnings multiple right so the priced earnings multiple went from 120 down to something closer to 30 and so I said to you know in this dinner conversation I said well nvidia's you know PE today is 30 it's not 120 right so you would have to think that there would be 70% PE compression from here or that their revenue was going to fall by 70% or that their earnings were going to fall by 70% you know in order to have a Cisco like event we all have post-traumatic stress about that I mean hell I you know I live through that too nobody wants to repeat that but when people make that comparison it strikes me as uninformed right um it's not to say that there can't be a pullback but given what you just told us about the buildout next year given what you told us about scaling laws continuing you know um what do you think when you hear you know the Cisco comparison when people are talking about Nvidia yeah so I I think there's um a couple things that are not fair right Cisco's Revenue a lot of it was funded through uh private credit investments into building out Telecom infrastructure right uh when we look at nvidia's Revenue sources very little of it is private credit right and in some cases yes it's private credit like cor weave right but cor weave is just back stopped by Microsoft there is significant amounts of like different in like what is the source of the capital right the other thing is at the peak of the you know especially when to inflation adjust it um the the private Capital entering the space was much larger than it is today right as much as people say the Venture markets are going crazy throwing these huge valuations at you know all these companies and we were just talking about this before the show but like hey the The Venture markets the private markets have not even tapped in right guess what private Market money like in the 
Middle East and these Sovereign wealth funds it's not coming in yet has barely come in right why why why why wouldn't there be a lot more spend from them as well right um and so there is a significant amount of the difference of capital the source is positive cash flows of the most profitable companies that have ever lived or ever existed in humanity versus credit speculatively spent right so I think that is like a big aspect uh that also gives it a bit of a a knob right these companies that are profitable will be a bit more rational I think I think Corporate America is investing more in Ai and with more conviction than they did even in the internet wave also maybe we can switch a little bit you've mentioned inference time reasoning a few times now is clearly a new Vector of scaling intelligence and I read some of your analysis recently about how inference time reasoning is way more compute intensive right than simply pre-train you know scaling pre-training why don't you walk us through we have a really interesting graph here about why that's the case um that we'll we'll post as well but why don't you walk us through first just kind of what inference time reasoning is from the perspective of compute consumption why it's so much more compute intensive and so leading to the conclusion that if if this is in fact going to continue to scale as as a new Vector uh of of of of intelligence it it looks like it's going to be even more computer intensive than what came before it yeah so pre-training maybe slowing out or it's it's too too expensive but there's these other aspects of synthetic data generation and inference time compute inference time compute is um on the surface sounds amazing right I don't need to spend more training a model but when you think about it for a second this is actually very very uh this is not the way you want to scale you only do that because you have to right um the way because right think about it gbd4 was trained with hundreds of millions of dollars and it's generating billions of dollars of Revenue hundreds of millions of dollars hundreds of millions of dollars to train GD and it's generating billions of dollars of Revenue So when you say like hey Microsoft's capex is nuts sure but their spend on gp4 was very reasonable relative to the ROI they're getting out of it right now when you say hey I want the next gain um if I just spend you know sort of a large amount of capital and train a better model awesome but if I don't have to spend that large amount of capital and I deploy it you know I deploy this better model without you know at the time of Revenue generation rather than ahead of time when I'm training the model this also sounds awesome but this this comes with this big trade-off right um when you're running reasoning right you you're having the model generate a lot and then the answer is only a portion of that right today when you open up chat GPT use gbd4 40 you say something you get a response correct you send something you get a response whatever it is right all of the stuff that's being generated is being sent to you now you're having this reasoning phase right and open AI does want to show you but you know there's some open source Chinese models like Alibaba and deep seek they've released some open source models which are not quite as good as open ey of course but they show you what that reasoning looks like if you want to and open has really some examples it generates tons of things it's like it sometimes switches between Chinese and English right like 
When you're running reasoning, you're having the model generate a lot, and the answer is only a portion of that. Today, when you open up ChatGPT and use GPT-4o, you send something and you get a response; all of the stuff being generated is sent to you. Now you have this reasoning phase. OpenAI doesn't want to show it to you, but some of the open-source Chinese models, Alibaba and DeepSeek, have released open models that aren't quite as good as OpenAI's, of course, but they show you what that reasoning looks like if you want, and OpenAI has released some examples. It generates tons of stuff; it sometimes even switches between Chinese and English. Whatever it is, it's thinking: should I do it this way, should I break it down into these steps, and then it comes out with an answer.

On the surface, awesome: I didn't have to spend any more on R&D or capital ahead of time. I'm saying that loosely, since I don't think Microsoft treats training models as R&D on a financial basis, but the point is you don't have that spend ahead of time; you pay at serving time. But think about what that means. One simple thing we've done a lot of tests on is: hey, generate me this code, make this function. I describe the function in a few hundred words, I get back a response, awesome, and I'm paying per token. When I do this with o1 or any other reasoning model, I'm sending the same request, a few hundred tokens, and paying for that; I'm getting roughly the same response, call it a thousand tokens; but in the middle there were 10,000 tokens of it thinking. What do those 10,000 tokens of thinking actually mean? They mean the model is spinning out ten times as many tokens. Well, if Microsoft is generating, call it, $10 billion of inference revenue, and their margins on that are good, as they've stated, anywhere from 50 to 70% gross margins depending on how you count the OpenAI profit share, then their cost for that $10 billion of revenue is a few billion dollars. Now, obviously the better model gets to charge more, and o1 does charge a lot more, but you've increased your cost from outputting a thousand tokens to outputting 11,000 tokens; I've 10x'd my spend to generate. It's not the same thing, it's higher quality, correct, but that's only part of it, and it's deceptively simple, because it's not just 10x. If you look at o1, despite it being the same model architecture as GPT-4o, it actually costs significantly more per token as well, and that's because of the chart we're looking at here.

The chart shows what GPT-4o is doing if I'm generating, call it, a thousand tokens; that's the bottom right. We use Llama 405B because it's an open model, so it's easier to simulate the exact metrics. If I'm doing that while keeping the user's experience of the model constant, i.e. the number of tokens per second they're getting, then when I ask it a question and it generates the code, whatever it is, I can group together many users' requests: over 256 concurrent users' requests on one Nvidia server for Llama 405B, a $300,000 server or so. When I do this with o1, because it's doing that thinking phase of 10,000 tokens, you hit the whole context-length problem. Context length is not free. Context length, or sequence length, means the model has to compute attention, i.e. it spends a lot of memory generating this KV cache and reading this KV cache constantly. Now the maximum batch size, i.e. the number of concurrent users, is a fraction of that: a quarter to a fifth as many users can concurrently use the server. So not only do I need to generate 10x as many tokens; each token that's generated is spread across four to five times fewer users. The cost increase per token generated is four to five x, and then I'm generating 10x as many tokens, so you could argue the cost increase is 50x for an o1-style model, input to output.
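Here is a back-of-envelope version of that multiplier, using the round numbers from the conversation: about 1,000 visible output tokens, about 10,000 hidden reasoning tokens, and batch size falling from roughly 256 concurrent users to roughly a fifth of that on the $300,000 server mentioned. The amortization period and per-user token speed below are assumptions added purely for illustration.

```python
SERVER_PRICE = 300_000                                   # the ~$300k Nvidia server mentioned
SERVER_COST_PER_HOUR = SERVER_PRICE / (4 * 365 * 24)     # assume 4-year straight-line amortization
TOKENS_PER_USER_PER_HOUR = 30 * 3600                     # assume ~30 tokens/s per user, held constant

def cost_per_query(total_tokens: int, concurrent_users: int) -> float:
    """Share of the amortized server cost consumed by one query."""
    hours = total_tokens / TOKENS_PER_USER_PER_HOUR
    return (SERVER_COST_PER_HOUR / concurrent_users) * hours

chat_query      = cost_per_query(total_tokens=1_000,  concurrent_users=256)       # GPT-4o-like
reasoning_query = cost_per_query(total_tokens=11_000, concurrent_users=256 // 5)  # o1-like

print(f"chat-style query:      ${chat_query:.4f}")
print(f"reasoning-style query: ${reasoning_query:.4f}  (~{reasoning_query / chat_query:.0f}x)")
# ~11x the tokens served at ~1/5 the batch size comes out to roughly 50x per query.
```

The absolute dollar figures are meaningless given the assumptions; the point is that the ratio is driven by tokens generated times the drop in batch size, not by either factor alone.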
I knew the 10x, because it was on the original o1 release chart with the log scale, but I didn't realize the rest. So it's 10x, and it just requires you, again, to have multiples more compute to service the same number of customers.

Well, there's good news and bad news here, Brad, which I think is what Dylan's telling us. If you're just selling Nvidia hardware, and they remain the architecture, and this is our scaling path, people are going to consume way more of it. Correct. But the margins for the people generating on the other end, unless they can pass it on to the end consumer, are going to compress.

And the thing is, you can pass it on to the end consumer, because it's not "oh, it's x% better on this benchmark"; it literally could not do this before, and now it can. And they're running a test right now where they're 10x-ing what they charge the end consumer. And that's 10x per token, correct; remember, they're also paying for 10x as many tokens, so the consumer is actually paying something like 50x more per query. But they're getting value out of it, because now all of a sudden it can pass certain benchmarks like SWE-bench, the software engineering benchmark, which is basically a benchmark for generating decent code. There's front-end web development: what do you pay front-end web developers, what do you pay back-end developers, versus what if they use o1, how much more code can they output? Yes, the queries are expensive, but they're nothing close to the human. So each level of productivity gain, each capabilities jump, is a whole new class of tasks the model can do, and therefore I can charge for that. This is the whole axis of: yes, I spend a lot more to get the same output, except you're not getting the same output with this model.

Are we overestimating or underestimating end demand, enterprise-level demand, for the o1 model? What are you hearing?

I would say the o1-style model is so early days that people don't even get it yet. OpenAI just cracked the code and they're doing it, but guess what: right now on some of the anonymous benchmarks, LMSYS, which is like an arena where different LLMs compete and people vote on them, there's a Google model doing reasoning that isn't released yet but is going to be released soon enough, and Anthropic is going to release a reasoning model. These people are going to one-up each other. And they've spent so little compute on reasoning so far in terms of training time, and they see a very clear path to spending a lot more, i.e. jumping up the scaling laws. Oh, I only spent $10 million? Well, wait, that means I can jump up two to three orders of magnitude in scaling, because I've already got the compute. You can go from $10 million to $100 million to $1 billion on reasoning in quick succession, and so the performance improvements we'll get out of these models in the coming six months to a year are humongous, in certain benchmarks where you have functional verifiers.
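To put a number on the earlier point that "the queries are expensive, but they're nothing close to the human": the figures below are entirely hypothetical, not actual o1 pricing or salary data, and only illustrate why a lab can raise prices well above compute cost.

```python
QUERY_COST = 1.00         # assume an expensive reasoning query costs about $1
QUERIES_PER_TASK = 20     # assume ~20 queries to land a working front-end feature

DEV_RATE_PER_HOUR = 75.0  # assumed fully loaded cost of a front-end developer
HOURS_PER_TASK = 8        # assumed hands-on time for the same feature

model_cost = QUERY_COST * QUERIES_PER_TASK
human_cost = DEV_RATE_PER_HOUR * HOURS_PER_TASK

print(f"model cost per task: ${model_cost:.0f}")
print(f"human cost per task: ${human_cost:.0f}  ({human_cost / model_cost:.0f}x higher)")
# Even a query priced at 50x the old one leaves plenty of room before it
# approaches the cost of the labor it replaces, which is the pricing-power point.
```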
Quick question, and we promised we'd go to these alternatives, so we'll have to get there eventually, but if you go back: we've used this internet-wave comparison multiple times. When all of the venture companies got started on the internet, they were all on Oracle and Sun, and five years later they weren't on Oracle or Sun, and some have argued it went from a development-sandbox world to an optimization world. Is that likely to happen here, is there an equivalency or not? And could you touch on why the back end is so steep and cheap: you go one model behind, and the price you can save by just backing up a little bit is nutty.

Yeah. So today o1 is stupendously expensive; you drop down to GPT-4o and it's a lot cheaper; you jump down to 4o-mini and it's so cheap. Why? Because with 4o-mini I'm competing against Llama, against DeepSeek, against Mistral, against Alibaba, against tons of others.

You think those are market-clearing prices?

I think so. And in addition, there's the fact that inferencing a small model is quite easy. I can run Llama 70B on one AMD GPU, I can run Llama 70B on one Nvidia GPU, and soon enough on Amazon's new Trainium. I can basically run this model on a single chip. I won't say it's a very easy problem, it's still hard, but it's quite a bit easier than running this complex reasoning model or this very large model. So there's that difference. There's also the fact that there are literally fifteen different companies out there offering inference APIs on Llama, Alibaba, DeepSeek, Mistral, these different models.

You're talking about Cerebras and Groq and Fireworks and all these others?

Yeah, Fireworks, Together, all the companies that aren't using their own hardware; and of course Groq and Cerebras are doing their own hardware and doing this as well. But the margins in this market are bad. We had this whole thing about the inference race to the bottom when Mistral released their Mixtral model late last year, which was pretty revolutionary because it was a level of performance that didn't exist in open source, and it drove pricing down so fast, because everyone's competing for the API business. What am I as an API provider giving you; why don't you switch from mine to his? It's pretty fungible: I'm still getting the same tokens from the same model, so the margins for these guys are much lower. Microsoft is earning 50 to 70% gross margins on OpenAI's models, and that's with the profit share, or the share that they give them. Anthropic, similarly, was showing something like 70% gross margins in their most recent round, but that's because they have the best model. You step down a tier, and far fewer people use that model from OpenAI or Anthropic, because they can just take the weights from Meta's Llama, put it on their own server, or go to one of the many competing API providers, some of them venture funded, and so on; there's all this competition there. So not only are you saying "I'm taking a step back and it's an easier problem," where if the model's 10x smaller it's something like 15x cheaper to run, but on top of that you're removing that gross margin, so it's not 15x cheaper to run, it's 30x cheaper to run.
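The "step down a model" arithmetic, using the round numbers just quoted: a roughly 10x smaller model is roughly 15x cheaper to serve, and the commodity API market for open-weight models strips out the frontier provider's gross margin on top of that. The dollar figures below are placeholders for illustration, not real price sheets.

```python
frontier_serving_cost = 15.00   # assumed $ per million tokens to serve the big model
frontier_gross_margin = 0.50    # low end of the 50-70% margin range quoted

small_model_serving_cost = frontier_serving_cost / 15   # ~10x smaller -> ~15x cheaper
commodity_gross_margin = 0.0    # race-to-the-bottom API market, margin near zero

frontier_price  = frontier_serving_cost / (1 - frontier_gross_margin)
commodity_price = small_model_serving_cost / (1 - commodity_gross_margin)

print(f"frontier model price:  ${frontier_price:.2f} per million tokens")
print(f"open-model API price:  ${commodity_price:.2f} per million tokens "
      f"(~{frontier_price / commodity_price:.0f}x cheaper)")
# ~15x from the smaller model plus ~2x from removing a 50% gross margin is ~30x.
```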
And so this is sort of the beauty of it. Is everything a commodity? No. But there is a huge chase: A, if you're deploying this in services, that's great for you; and B, you have to have the best model or you're no one if you're one of the labs. And so you see a lot of struggles for the companies that were trying to build the best models and failing. Arguably, not only do you have to have the best model, you have to have an enterprise or a consumer willing to pay for the best model, because at the end of the day the best model only matters if somebody is willing to pay you those high margins, and that's either an enterprise or a consumer. So I think you're quickly narrowing down to just a handful of folks who will be able to compete in that market.

On the model side, yes. On who's willing to pay for these models, I think a lot more people will pay for the best model. When we use models internally, we have language models go through every regulatory filing and permit to look at data center stuff, pull it out, and tell us where to look and where not to, and we just use the best model, because it's so cheap relative to the value; the data I'm getting out of it, the value I'm getting out of it, is so much higher.

What model are you using?

We're using Anthropic, actually.
