Evidence for the Utility of Quantum Computing before Fault Tolerance | Qiskit Seminar Series

Captions
Thank you, and good morning everyone, and welcome to the IBM Qiskit live quantum seminar series, dedicated to you, the research and academic quantum communities. In just a minute I'll be thrilled to roll out this week's episode with our very own Andrew Eddins and Youngseok Kim from IBM Quantum, who join us here on stage. As is our tradition before we get there, I'm glad you joined us on time, because we like to give everybody a minute or two to tune into the live stream and answer our favorite question: where are you tuning in from today? You can reply in the comment chat box here on YouTube, located somewhere on your screen, above, below, left or right. That's the same place where you can ask questions of Andrew and Youngseok live as we go through this seminar, and I'll try to bring those questions up to them. We like to keep it lively, with discussion both between you and our speakers and, as often happens, among yourselves as well. I see that today we have folks from New York City, Boston, Mexico, Colombia, Portland Oregon, Santa Barbara (hello Kelly, great to see you again), Italy, Spain, India, the Netherlands, Texas, and more names streaming in, so thank you everybody for making it across all the different time zones. Oh, YKT, Yorktown, my former hometown. Folks, this seminar takes place every Friday at noon Eastern time. The recording will stay up, but you can only ask questions live during the seminar; to know what's coming up, click like and subscribe. With that, it's time to introduce our very special seminar today on this recent Nature paper from IBM Quantum, which has certainly generated a lot of very interesting discussion in the community. I'm excited to briefly introduce not one but two speakers, Andrew and Youngseok, who join us for the 130th episode of this seminar. Hello Andrew, hello Youngseok, how are you today? Hi Zlatko, I'm doing good, thank you. Good, and where is each of you tuning in from? We'll give Andrew a minute here, since Andrew will be our first speaker. Yeah, I'm tuning in from New Jersey. And Andrew, I think we've got you now. Hey everyone, I'm tuning in from the IBM office in Cambridge, Massachusetts. All right, folks, according to our tradition, allow me to introduce Andrew and Youngseok briefly before I hand over the stage. Youngseok received his bachelor's degree in electrical engineering from POSTECH in South Korea, and his master's and PhD degrees in electrical and computer engineering from the University of Illinois Urbana-Champaign. Youngseok is now a research staff member at the IBM T.J. Watson Research Center, my home base. Andrew received his bachelor's from Amherst College and his PhD from UC Berkeley with Irfan Siddiqi; fun fact, the same group as my bachelor's group. Andrew is an IBM research staff member in Cambridge, Massachusetts. So Andrew, with that, the stage is yours. Great, thank you Zlatko, and thank you everyone for the opportunity to speak here. Today we'll discuss the results from our paper titled "Evidence for the utility of quantum computing before fault tolerance." This is really a paper about what we can hope to do with a noisy quantum computer in the near term. I'll be presenting the first half, and then Youngseok will take over to discuss more of the experimental results and beyond.
Youngseok and I are presenting, but a lot of people contributed to this; it was very much a team effort, so I'd like to acknowledge these folks at IBM, our collaborators at UC Berkeley in Professor Zaletel's group, and also the broader IBM Quantum team. To make that broader acknowledgment a little more explicit, I wanted to flash up the broader IBM Quantum team; that's a little challenging to put on a slide, but I wanted to recognize the large effort from many people. To start with an excerpt from the abstract: we report experiments on a noisy 127-qubit processor and demonstrate the measurement of accurate expectation values for circuit volumes at a scale beyond brute-force classical computation, and we argue that this represents evidence for the utility of quantum computing in a pre-fault-tolerant era. In other words, the main takeaway we hope to convey is that quantum computers today can provide reliable results at a scale beyond exact brute-force classical computation. This is not a quantum advantage claim, but it does motivate trying to obtain value from using error mitigation to estimate expectation values, even in the era before fault tolerance. That may be a surprising notion, that we can get value out of a noisy quantum computer, or do something with a quantum computer that might be hard for a classical computer. It certainly looks that way if you think about the error rates on the respective hardware: the error rate on a quantum computer, even a good quantum computer, is something like an Avogadro's number higher than the error rate of a classical transistor. The usual conventional wisdom is that we'll need quantum error correction to overcome this, where we encode quantum information in some larger network of qubits to protect that information from errors, especially local errors that happen on the device. Long term, that still seems to be the case: we need error correction to access the full potential of quantum computing and run things like Shor's algorithm and so on. In the near term, though, the reality is that the expected overhead of conventional error correction schemes seems prohibitive, or at least very limiting; we'll need very large numbers of physical qubits to realize one logical qubit. There has been significant progress in improving the quality of devices. This is a distribution of two-qubit gate errors across a few generations of Falcon devices, and we see each generation generally moving left here; lower is better. There have also been improvements in terms of quantity of qubits: going from IBM Falcon with 27 qubits to Hummingbird with 65 qubits and Eagle with 127 qubits, and perhaps remarkably, with this increase in scale we've also managed to maintain the increase in quality of the gate performance. Very recently the Osprey device with 433 qubits has also been brought online. These are encouraging trends, but consider a simple back-of-the-envelope calculation for the fidelity of actually running a large circuit. Say we're targeting something with 100 qubits and 100 layers of two-qubit entangling gates, so a gate depth of 100; that gives us in the ballpark of ten thousand two-qubit gates, modulo factors of two. Even if our two-qubit gate fidelity is on the order of three nines, the estimated state fidelity at the end of that circuit is going to be something like 10^-4, when ideally that number should be one.
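As a quick check of that back-of-the-envelope number, here is the arithmetic written out; the 0.999 per-gate fidelity and the 10^4 gate count are the round figures quoted in the talk, not exact device parameters:

    F_{\mathrm{circuit}} \approx f_{2q}^{\,N_{\mathrm{gates}}}
      = 0.999^{\,10^{4}}
      = e^{10^{4}\ln(0.999)}
      \approx e^{-10}
      \approx 4.5\times 10^{-5},

which is the roughly 10^-4-scale state fidelity mentioned above.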
So that's a discouraging prediction, but we can ask whether there might be a loophole, because this number describes how well the full quantum state has been preserved, as we might think about for quantum error correction. Could it be that that's a more difficult task than just estimating properties of the quantum state of interest? Here's a simple cartoon to motivate this possibility. Imagine there's some state we're interested in for a spin system; suppose this state is fully magnetized, so all the spins are pointing up, and the true value of interest is a magnetization of one. That's what we'd like to learn. We try to prepare this state on a noisy quantum computer, an error happens, and in this minimal example let's say it just flips a single bit. The fidelity of this state with respect to the target state is now zero, because the two states are orthogonal, but the magnetization is largely unchanged: it's still 0.998, and that may be a perfectly acceptable estimate of the property of interest. Again, this is a simple example, but it at least illustrates that there might be some separation between fidelities and the ability to get out useful results. Are expectation values something we're interested in measuring? We think so: you can look at properties of spin systems like magnetization or correlation functions, machine learning kernels, optimization cost functions, molecular energies and chemical simulations, and generally, measuring expectation values from these shallow quantum circuits seems to be a core primitive of most near-term algorithms that have been proposed. The way we access this information accurately on a noisy quantum computer is through quantum error mitigation. To briefly introduce the idea: whereas in error correction we try to detect individual errors as they occur during an experiment and essentially use that information to repair the quantum wave function in real time, in error mitigation we instead learn the distribution of errors that happen on the hardware, essentially a noise model, as we'll call it through the rest of the talk, and use that information to repair the classical averaged results in post-processing. This can be an easier task, though it typically comes at an exponential computational cost. These ideas were introduced in some of these earlier papers, and you can check out the references for more information. One example of quantum error mitigation that we'll focus on is zero-noise extrapolation, or ZNE. It's a relatively straightforward idea: we run the experiment and try to estimate some expectation value of interest; that result may be biased away from the true value due to noise and errors in the hardware. But if we can somehow amplify this noise uniformly across the device, and we'll discuss the implementation later, we can repeat the experiment in that noise-amplified condition, get a result that's worse, and then look at the trend and estimate what the result would be if the noise were removed or turned down. This rather simple idea has proven to be quite fruitful across a range of experiments.
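To make the extrapolation step concrete, here is a minimal sketch of the kind of fit being described, written in plain Python with NumPy and SciPy. The noise-gain values and noisy expectation values are made-up numbers for illustration, not data from the experiment, and the simple exponential form is only one of the fit models mentioned in the talk.

    import numpy as np
    from scipy.optimize import curve_fit

    # Hypothetical measured expectation values at a few noise gains G
    # (G = 1 is the bare hardware noise; larger G means amplified noise).
    gains = np.array([1.0, 1.2, 1.6])
    values = np.array([0.79, 0.77, 0.71])   # illustrative numbers only

    # Linear fit: E(G) = a + b*G, extrapolated to G = 0
    slope, intercept = np.polyfit(gains, values, 1)
    zne_linear = intercept  # value of the line at G = 0

    # Exponential fit: E(G) = A * exp(-c*G), extrapolated to G = 0
    def exp_model(G, A, c):
        return A * np.exp(-c * G)

    (A, c), _ = curve_fit(exp_model, gains, values, p0=(1.0, 0.1))
    zne_exponential = exp_model(0.0, A, c)

    print(f"linear ZNE estimate:      {zne_linear:.3f}")
    print(f"exponential ZNE estimate: {zne_exponential:.3f}")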
For example, in this simulation of lithium hydride, the red dots were the original unmitigated results, corresponding to this point in the cartoon; the noise-amplified points are a little further from the ideal result (again, not worrying about the method yet, we're essentially amplifying the effect of the noise on the hardware); and the extrapolated points, the solid green, are much closer to the correct result for this model of the molecule. This slide now looks at scaling the technique up to larger experiments. The two axes represent the size of the quantum circuit that's been run: on the vertical axis the number of qubits, on the horizontal axis the number of layers of two-qubit gates, or the depth of the circuit. In the lower left we have the experiment from the previous slide, several additional experiments using another technique known as probabilistic error cancellation, and then a more recent result using zero-noise extrapolation on a Falcon chip, using 26 qubits and a gate depth of 120 layers of CNOTs, which was still able to estimate expectation values correctly at the end of that long circuit. In the latest paper we're discussing today, we're using all 127 qubits on an Eagle processor, running circuits with a gate depth of 60, and are able to get out accurate results, as we'll discuss. Having crossed these two 100-thresholds individually gives us some confidence in our next goal of targeting circuits on the order of 100 by 100 or greater. So what made this latest experiment possible? First was simply building a system with 127 qubits; building these chips and growing the processor size is a major effort in the hardware and software stacks. A key aspect is coherence improvements: even among different generations of the same 127-qubit model of processor, an earlier generation, Washington, had a median T1 of about 100 microseconds, while the device used for this experiment, Kyiv, had a median T1 in the ballpark of 250 microseconds, which gives us significantly more breathing room to run the experiment. Advances in device calibration also allow us to improve the performance of our gates. For example, we used to calibrate each of the gates individually, which is something of a local optimization of gate performance. As we were developing this experiment, it became clear that if we know in advance which gates are going to be run simultaneously, such as all the red gates here, we can use that information in our calibration, calibrate that entire layer of gates together, and get a boost in performance that way by doing more of a global optimization. Here's one plot indicating some of the improvements; the result of that and other improvements reduced the median gate error below one percent. Finally, improvements in noise modeling and error mitigation have been helpful, and this includes both scalable characterization methods for learning the noise on the device and more accurate methods for amplifying the noise for the zero-noise extrapolation procedure, so I'll spend a few slides discussing these. Maybe a quick question, Andrew, before you dive into those slides, from Brooke: roughly how much of an improvement did you see with this context-aware calibration? I might have to defer to Youngseok on that, as he is more of an expert in that particular detail. Yeah, if I can come in: one of the biggest benefits we get from this context-aware calibration is that previously we didn't really know whether there was some additional error coming from running these two-qubit gates simultaneously; for instance, there can be spectator-related errors, or some multi-tone processes inducing a significant amount of leakage. Those weren't really captured in the previous individual approach, but in this context-aware calibration scheme it is possible to catch that and factor it into the calibration. We often see some outliers that are particularly bad when we run everything simultaneously, and we can capture those kinds of errors. Okay, so something like a five to ten percent improvement, ballpark? Yeah, probably. Okay, thank you; let's press forward. All right, thanks. So, going into learning the noise: we'll think about how we want to construct our noise model, and first of all how we want to model the noisy circuit we're trying to run. We'll make a series of simplifying assumptions, trying to be judicious about them. The first is that typically the two-qubit gates are the dominant noise source; the one-qubit gates are generally much higher fidelity, so we'll neglect the small amount of noise happening in the single-qubit gates and model only the two-qubit gates as containing the noise. To do this, we'll represent the actual two-qubit gate that's run in terms of an ideal version of that gate and a noise channel that contains all the effects of decoherence, control errors, crosstalk and so on, generally things going wrong. Now, a general many-qubit noise channel has an enormous number of parameters that one could learn; it's very general, and there are a lot of different things that could go wrong. This is a representation of it in terms of an object known as a Pauli transfer matrix, and many parameters are not shown here. A simplification we can make is to use Pauli twirling, which is discussed in some of these references: each time we run the circuit, we introduce a layer of random single-qubit Pauli gates just before each two-qubit layer, and then follow the layer with the appropriate Paulis such that the two layers of Paulis cancel one another. In the case of an ideal circuit, these two layers of Paulis have no effect on its behavior, but they do have an effect on the average behavior of the noise. For example, sometimes a qubit will go through this layer of Paulis and be pointing up, and other times it will be flipped by an X Pauli and be pointing down, and this 50-50 probability (and there are other Paulis, of course) leads, on average, to many terms in the noise canceling, such that only the diagonal terms of the Pauli transfer matrix survive. That's a first big simplification, but there is still an exponential number of terms to learn in the noise, so we'll make a further simplifying assumption, which is that errors are generated locally. We'll say that the Pauli channels that comprise our twirled noise layer are either weight-one or weight-two, so errors are generated either on individual qubits or on nearest-neighbor pairs of qubits, and now this is a polynomial number of terms in the model.
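As an illustration of the twirling construction described above, here is a small self-contained NumPy sketch that conjugates a random two-qubit Pauli through a CNOT, so that the pre- and post-layer Paulis cancel for the ideal gate. This is only a toy demonstration of the identity being used, not the production twirling code from the experiment.

    import numpy as np

    # Single-qubit Paulis and the ideal CNOT
    I = np.eye(2); X = np.array([[0, 1], [1, 0]])
    Y = np.array([[0, -1j], [1j, 0]]); Z = np.diag([1.0, -1.0])
    PAULIS = [I, X, Y, Z]
    CNOT = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=complex)

    rng = np.random.default_rng(seed=7)

    def random_twirl_pair():
        """Pick a random two-qubit Pauli P and the compensating Pauli P2
        such that P2 @ CNOT @ P equals CNOT up to a global phase."""
        P = np.kron(PAULIS[rng.integers(4)], PAULIS[rng.integers(4)])
        P2 = CNOT @ P @ CNOT.conj().T   # a CNOT maps Paulis to Paulis
        return P, P2

    # Check: the twirled layer acts like the bare CNOT on the ideal circuit
    for _ in range(5):
        P, P2 = random_twirl_pair()
        twirled = P2 @ CNOT @ P
        assert np.allclose(twirled, CNOT)
    print("random Pauli twirls leave the ideal CNOT unchanged")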
Since these error processes are generated locally on the device, we can also learn them in parallel, and both the experiment and the number of parameters stay tractable: the number of parameters is polynomial in the size of the circuit. You could imagine that, if you were able to run the noise channel in isolation, you might learn the diagonal terms of this Pauli transfer matrix by preparing a particular Pauli basis state, applying some number d of copies of the noise channel, and then measuring in that basis to see how much the expectation value has decayed; it would decay by the relevant Pauli fidelity raised to the number of times you applied the layer. Reality is slightly more complicated, because we can only run the entire noisy two-qubit layer, so there are extra gates in addition to the noise channels. This leads to needing another assumption to address that; the details are in these papers, and we'll need to run experiments to test whether it works in practice. But with that additional assumption, essentially a symmetry assumption about the noise model, we can back out the error rates of all the particular local error sources happening inside of our noise channel. There were some questions in the audience about the learning model; maybe I can also help out here and add that there's a Qiskit seminar that goes into this whole framework along with probabilistic error cancellation; I'll try to post the link in the channel, that one's from me I guess, but if you're interested in more details, that's a more pedagogical place to look. Thanks, and if we have time at the end I'm also happy to take questions. So, from doing these fits and looking at the decay curves of the different Pauli expectation values, and one other detail I'll add is that we only need to look at the decays of the weight-one and weight-two Paulis, which is sufficient information and keeps things tractable, we can extract our noise model. This comprises the rates at which the different single-qubit and two-qubit Pauli errors are generated; those are encoded in the color code on this map of the device, where the three sectors correspond to the X, Y and Z error rates on each qubit, and the nine small inset rectangles are the rates for the two-qubit Paulis. So we've reduced the model complexity from an exponential 4^127 down to only about 1,700 parameters, and these can again be learned in parallel across the device.
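As a toy version of the decay-fit step just described, here is a short Python sketch that recovers a layer Pauli fidelity f from synthetic measurements of a Pauli expectation value after d repetitions of the noisy layer. The data are simulated, and the single-fidelity model A * f**d is the simplifying assumption stated in the talk, not the full learning protocol from the paper.

    import numpy as np

    rng = np.random.default_rng(seed=3)

    # Ground truth for the toy model: SPAM prefactor A and layer Pauli fidelity f
    A_true, f_true = 0.97, 0.995
    depths = np.array([2, 4, 8, 16, 32, 64])

    # Simulated noisy estimates of <P> after d applications of the twirled layer
    expvals = A_true * f_true**depths + rng.normal(0.0, 0.002, size=depths.size)

    # Fit log<P> = log(A) + d*log(f) with a straight line
    slope, intercept = np.polyfit(depths, np.log(expvals), 1)
    f_fit, A_fit = np.exp(slope), np.exp(intercept)

    print(f"recovered fidelity f = {f_fit:.4f} (true {f_true})")
    print(f"recovered SPAM factor A = {A_fit:.3f} (true {A_true})")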
With this noise model in hand, we'd like to perform zero-noise extrapolation, and specifically we'd like to amplify the noise. Suppose this is the circuit we'd like to run: we have some single-qubit gates and then our noisy layer of two-qubit gates, accompanied by the noise channel representing the hardware noise on the device. If we think about what that channel is doing, since it's a Pauli channel, its effect is equivalent to introducing, every time we run the circuit, a random Pauli error sampled from the distribution in the noise model we learned in the previous slides. Since we now have that noise model in hand, there's nothing stopping us from introducing our own version of this, by sampling from the same noise model and applying random Pauli errors accordingly. If we do this, the math works out so that the two channels combine to effectively double the rates at which errors are generated, if we're sampling from the same distribution, so this would give us a noise gain of two. There's of course nothing stopping us from varying these probabilities continuously, so we can turn that knob up and down to amplify the hardware noise across the device by any factor we choose, any noise gain factor, which we'll sometimes write as G. As a quick example of what this looks like, this plot shows an expectation value measured on the circuit; don't worry about exactly what we're measuring just yet, but the ideal value in this example would be plus one. When we run on the hardware with a noise gain of one, so not introducing any extra errors, we accumulate data and the average converges towards roughly 0.8, so somewhat biased downwards from the correct value. We can rerun the circuit introducing errors at a rate of something like 0.2, such that the net gain including the intrinsic hardware noise is 1.2; this is now biased further from the true result. And again at 1.6, accumulating these randomized instances, resampling Paulis each time, and converging to a value still further away. We can plot these as a function of the noise gain and fit them with a simple model, either a line in light blue or an exponential in dark blue, the latter being motivated by some theoretical studies in the literature, and then use that to extrapolate back to the zero-noise condition, and we find that the bias is largely removed. This is a particularly nice example, but we do see that this seems to be working well across the range of experiments we've investigated so far. Speaking of experiments, I will now pass it over to Youngseok to talk more about the results, unless there are questions before that. Thank you, Andrew. Maybe a quick question from the audience on the relationship between this and PEC, the probabilistic error cancellation method; there was a little bit of discussion about that, if you want to elaborate. Yeah, to try to control the scope I didn't really get into PEC, but it's definitely a related method. There we're sampling essentially from a quasi-probability distribution, such that between the Paulis that we introduce and the post-processing that we do, where we introduce minus signs, we effectively have the opposite effect from the noise; I forget the reference exactly, but some of the literature refers to this as anti-noise. That allows us to cancel the effect of the hardware noise on average, rather than amplify it. Now, while both methods are expected to have exponential scaling in terms of the sampling cost, the number of times you need to run the experiment to get good precision, empirically zero-noise extrapolation seems to be cheaper, at least compared with conventional or standard implementations of PEC. Awesome, thank you, and since we're at the halfway mark, we'll take one or two more questions real quick before letting Youngseok get to all of the fun results, so stay tuned.
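Before the Q&A continues, here is a minimal sketch of the noise-amplification step just described, assuming the simplest picture in which the learned noise model is just a list of independent local Pauli error probabilities for one layer. The error list, rates and helper name are made up for illustration; in the actual experiment the amplification is carried out through the learned sparse Pauli noise model rather than this toy per-error sampling.

    import numpy as np

    rng = np.random.default_rng(seed=11)

    # Toy learned noise model for one twirled layer: each entry is
    # (qubits the Pauli acts on, Pauli label, probability per layer execution).
    noise_model = [
        ((5,),     "X",  0.0010),
        ((5, 6),   "ZZ", 0.0031),
        ((12,),    "Y",  0.0007),
        ((12, 13), "XZ", 0.0024),
    ]

    def sample_injected_errors(noise_model, gain):
        """Sample extra Pauli errors to insert before the layer so that the
        total error rate is amplified from 1 (bare hardware) to `gain`."""
        extra = []
        for qubits, pauli, p in noise_model:
            # Inject each modeled error with probability (gain - 1) * p, so the
            # hardware noise (rate p) plus injected noise gives rate ~ gain * p.
            if rng.random() < (gain - 1.0) * p:
                extra.append((qubits, pauli))
        return extra

    # One randomized instance of the circuit at noise gain G = 1.6
    injected = sample_injected_errors(noise_model, gain=1.6)
    print("Pauli errors to insert for this instance:", injected)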
A quick question here from She Shan, and I hope I'm saying that right: the results are really nice, and I'm particularly impressed by the use of error mitigation at this large scale, at the scale of hundreds of qubits; the cold atom community has also done some work with good results on expectation values; would you call those utility scale as well? I guess I don't want to say anything wrong, so I might have to look at the specific reference to say something with confidence. My impression is that those have been more specialized quantum simulators, as opposed to universal quantum computers, looking at simulating quantum systems at these scales, so that's one distinction I might draw with this work; but certainly, I know there's a lot of great work going on in the cold atom community. Okay, thank you Andrew, and She, feel free to elaborate; I know you've got a lot of questions, and I'm not sure we can get to all of them right now, so we'll save some for the end of the talk. Maybe one more, from Arthur Strauss: is the error mitigation method here coupled to the context-aware calibration, to enable better capturing of the parameter drift that occurs over time when you have to take many, many shots? So basically a question about drift in the systems. Yeah, parameter drift is definitely a real thing, and we found it necessary to periodically recalibrate and relearn the noise model during the experiment, on the order of hours; there's some really nice work on that, and I don't know if Youngseok wants to show slides on it later or not. Good, maybe Youngseok will get to that in the second half of the talk, so stay tuned for that. And then maybe a final question before we hand it over to Youngseok, from Diego Emilio: can you clarify the difference between utility and advantage? You mentioned this work is not a claim of advantage; what's the difference? With utility we were trying to draw a distinction, to suggest a break with the conventional wisdom that we would need to wait for fault tolerance to realize useful results from a quantum computer; that was the main thing. We think that the fact that we can apply error mitigation successfully at circuit volumes beyond where you're guaranteed an exact result from brute-force classical methods is encouraging in that direction. I hope that's somewhat clarifying, at least. Thank you Andrew, and Diego, feel free to elaborate in the comments and discussion here; folks, feel free to post your opinions as well. Diego says great, thank you for the answer, so it's nice to see that lively discussion. I know there are maybe four or five more questions and I can't get to all of them right now, so I apologize, but we'll bring them back up at the end of the talk. In the interest of time, Youngseok, the stage is yours. All right, thank you Andrew and Zlatko. I'm going to cover the remaining part of the talk, mainly talking about what we did experimentally. We take this transverse-field Ising model Hamiltonian, which describes spin dynamics: we have a nearest-neighbor interaction term with coupling J and a transverse-field term with strength h, and we map our spins exactly onto the hardware topology, as you can see here. We look at the time dynamics of this system, so we Trotterize this Hamiltonian to first order to obtain a Trotterized time-evolution circuit.
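For reference, here is the model and a first-order Trotter step written out. The identification of the rotation angles with J, h and the step size follows the standard conventions RZZ(theta) = exp(-i theta ZZ/2) and RX(theta) = exp(-i theta X/2), and should be read as an illustrative convention rather than a quote of the paper's exact parameterization:

    H = -J \sum_{\langle i,j\rangle} Z_i Z_j + h \sum_i X_i,
    \qquad
    e^{-iH\,\delta t} \approx
    \Big[\prod_{\langle i,j\rangle} e^{\,iJ\,\delta t\, Z_i Z_j}\Big]
    \Big[\prod_i e^{-ih\,\delta t\, X_i}\Big]
    = \Big[\prod_{\langle i,j\rangle} R_{ZZ}^{(i,j)}(-2J\,\delta t)\Big]
      \Big[\prod_i R_X^{(i)}(2h\,\delta t)\Big].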
Here we can decompose the evolution into two parts, an RZZ part and an RX part, so our circuit looks like a layer of single-qubit gates followed by a series of two-qubit gates. Since our topology has at most three-way connections, the two-qubit gates fall into three collections of layers; we go ahead and learn all three layers and perform our experiment accordingly. If we repeat this circuit structure twice, that represents the second Trotter step; three times, the third Trotter step, and so on. One more thing here: since our experiment is taken from a verification and capability perspective, we are interested in comparing against the ideal solution of this Trotterized circuit, not the ideal continuous-time solution, so we are ignoring the Trotterization error here. When it comes to actually realizing this circuit on our hardware, we need to decompose the RZZ gate into our native gate, which is the CNOT. In general we can decompose an RZZ gate into two CNOTs, but at a specific angle, such as minus pi over 2 here, it can be decomposed into a single CNOT, so you can drastically reduce the length of the circuit, or the coherence-time budget it consumes. Meanwhile, we can still explore other circuit parameters, because we have one more parameter, h: we vary this parameter h, or the single-qubit rotation angle theta_h, to explore different circuit regimes. Does that fix this angle? Yes, that's what I mean: here we just Trotterize to first order, so there will be some Trotterization error, but again, rather than looking at some interesting physical phenomenon such as a phase transition, here we are really focusing on verification, so we are ignoring the Trotterization error at this moment and just comparing our experimental results against this Trotterized circuit. Okay, thank you. One of the circuits we ran, if we take a look, is a 127-qubit circuit with 60 entangling layers; compare that against one of the early experiments done by IBM in 2017, with 4 qubits and 2 Trotter steps.
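To make the circuit structure concrete, here is a small Qiskit-style sketch that builds a few first-order Trotter steps of this kind on a short chain of qubits. The chain coupling map, the angle names theta_h and theta_j = -pi/2, and the step count are illustrative placeholders, not the 127-qubit heavy-hex layout or the exact angles scanned in the experiment.

    import numpy as np
    from qiskit import QuantumCircuit

    def trotter_circuit(num_qubits, num_steps, theta_h, theta_j=-np.pi / 2):
        """First-order Trotter steps of the transverse-field Ising model
        on a simple 1D chain (a stand-in for the heavy-hex coupling map)."""
        # Split the chain couplings into layers of non-overlapping RZZ gates
        even_bonds = [(i, i + 1) for i in range(0, num_qubits - 1, 2)]
        odd_bonds = [(i, i + 1) for i in range(1, num_qubits - 1, 2)]

        qc = QuantumCircuit(num_qubits)
        for _ in range(num_steps):
            # Transverse-field part: single-qubit X rotations
            for q in range(num_qubits):
                qc.rx(theta_h, q)
            # Ising coupling part: RZZ layers; at theta_j = -pi/2 each RZZ is
            # locally equivalent to a single CNOT plus one-qubit gates
            for a, b in even_bonds:
                qc.rzz(theta_j, a, b)
            for a, b in odd_bonds:
                qc.rzz(theta_j, a, b)
        return qc

    circuit = trotter_circuit(num_qubits=8, num_steps=3, theta_h=np.pi / 4)
    print(circuit.count_ops())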
You see quite a lot of improvement there, which illustrates how fast this field is evolving, and that's pretty interesting. Coming to this diagram: I mentioned earlier that we can explore different circuit regimes by varying the parameter h, or the single-qubit rotation angle theta_h. By varying this parameter we can visit different circuit regimes: we can make our circuit Clifford, meaning we can efficiently simulate the results even at the 127-qubit scale, which is a very nice verification point; outside those circuit parameters it's a non-Clifford circuit, so it's challenging to obtain the exact solution there. But if the relevant qubit count is around 30 to 40, it's still in the brute-force simulable regime, so we can still get exact solutions in those reduced-size cases. With that in mind, let's first check the Clifford circuits to build our confidence. What I mean by that is we fix the single-qubit rotation angle to zero, which makes the circuit Clifford, and we go ahead and learn the one, two, three collections of CNOT layers. Here are the results: the y-axis is the average magnetization over the 127-qubit device, and the x-axis is the Trotter step; we go up to 20 Trotter steps, which is equivalent to 60 CNOT layers. You can see the unmitigated signals are gradually decaying, deviating from the ideal value of one, and after we apply the mitigation scheme, it shows improved results. You can see it's not exactly hitting one; there might be various reasons for that, such as some model drift, which Andrew just mentioned, or some inaccuracy in our model, but we do see a clear improvement and a reliable estimate of the observable we are interested in. With that, we move on from Clifford circuits and want to try out non-Clifford circuits. For that we fix the depth to 15 entangling layers, and now we vary the single-qubit rotation angle theta_h. This is basically the same plot, the y-axis is the same average magnetization, but now the x-axis is the RX angle, so we are varying the circuit parameter. We know that at 0 and pi over 2 the circuit is Clifford, so we know exactly what the value is supposed to be, one and zero respectively, and in between it smoothly connects those two limits, but we don't have an exact answer, so we don't know whether our result is correct or wrong. Again, it's a non-Clifford circuit between 0 and pi over 2, so it's challenging to get the exact solution. That's why we approached our collaborators in the Berkeley group, Professor Michael Zaletel's group and Sajant, and asked whether they could verify our circuit. For this type of problem, tensor network methods have been one of the favorite approaches in the classical approximation community. A tensor network is a way to approximate your target wave function, with physical indices here; it comes in various forms, such as MPS and PEPS, and the different structures of these tensor networks are meant to capture different structures of the entanglement present in your system. Among them, MPS and PEPS are the most popular, and our collaborators chose MPS and isoTNS, which is a variant of PEPS. One more important parameter to mention here is the bond dimension: the bond dimension is a quantitative measure of how much entanglement you can express with your tensor network method.
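As a small illustration of what bond dimension means, here is a NumPy sketch that splits a random few-qubit state into two halves with an SVD and truncates to a fixed number of singular values, which is the kind of compression an MPS performs at every bond. The qubit count and bond dimensions are toy values chosen so the example runs instantly.

    import numpy as np

    rng = np.random.default_rng(seed=42)
    n = 10                      # toy system: 10 qubits, split 5 | 5
    psi = rng.normal(size=2**n) + 1j * rng.normal(size=2**n)
    psi /= np.linalg.norm(psi)

    # View the state as a matrix between the left and right halves and SVD it
    theta = psi.reshape(2**(n // 2), 2**(n // 2))
    u, s, vh = np.linalg.svd(theta, full_matrices=False)

    for chi in (1, 4, 16, 32):
        # Keep only the chi largest singular values (the "bond dimension")
        approx = (u[:, :chi] * s[:chi]) @ vh[:chi, :]
        overlap = np.vdot(theta.ravel(), approx.ravel())
        norm2 = np.vdot(approx.ravel(), approx.ravel()).real
        fidelity = abs(overlap)**2 / norm2
        print(f"chi = {chi:3d}:  fidelity with exact state = {fidelity:.4f}")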
For instance, with the MPS method we can think about two extremes: when the bond dimension equals one, it's a product state and we don't capture any entanglement present in the system; the other extreme is an exponentially large bond dimension, where we can exactly capture everything. In reality we take a bond dimension somewhere in between, depending on the memory requirements, the computational speed requirements, and so on. So the Berkeley group went ahead and, using MPS and isoTNS with a certain bond dimension, started to produce results, and we compared against each other. By varying the single-qubit rotation angle from 0 to pi over 2, we systematically introduce more and more entanglement into the system. When there is little entanglement present, the two different methods and the experimental results all agree to a reasonable degree, but at a certain point, as we introduce more and more entanglement, you can see they start to deviate: the experimental results and the numerical approximations deviate from each other, and the two numerical approximation methods also deviate from each other. At this point we don't have an exact solution, so we don't know exactly which one is right or wrong. But during our discussions, Sajant realized there's the concept of a light cone: as we apply entangling gates to nearest-neighbor qubits, the influence propagates outward, so only those qubits within the light cone really matter for a particular local observable. In our case the circuit is relatively shallow, so the light cone contains about 30 qubits or fewer, and that's a range we can tackle with brute-force simulation. So they went ahead and calculated the exact solution by brute force using this light-cone reduction technique, and here are the exact simulation results; you can see there is nice agreement between the experimental results and the exact simulation. It seems like the regimes near pi over 4 and pi over 2 are the ones of real interest, but it's not super clear, looking at the pi over 2 point, whether the experiment is really producing zero or whether it's just depolarized down to zero. So we deliberately chose another observable, a weight-10 observable, which is supposed to produce one at the pi over 2 point, and there we can still get the exact solution using the light-cone reduction technique. We saw a very similar trend there: the experiment more or less agrees with the exact solution, whereas the pure MPS and isoTNS methods start to deviate at a certain point. Here's another point: we used the learned noise model to perform a noisy Clifford simulation, and we checked that the Clifford simulation results are pretty close to what the experiment actually produced for the unmitigated data, within the experimental uncertainty. This is another check of whether the assumed noise model that Andrew mentioned earlier is reasonably representative; in other words, it can also be used as a tool to check, if you have a circuit and you know the noise model, whether that circuit is feasible on current hardware, just by running this simulation. Okay, so with that check we go a little bit further: this is a weight-17 observable, which eventually includes more qubits within the light cone, and we cannot really simulate this example in a brute-force manner, even with the light-cone reduction technique.
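Here is a short Python sketch of the light-cone counting argument: starting from the qubits in the support of an observable, walk backwards through the entangling layers and add every qubit that could have influenced them. The chain coupling map and layer structure are illustrative stand-ins for the device's heavy-hex layout.

    # Count which qubits can influence a local observable after d brick-work
    # layers of nearest-neighbor entangling gates on a 1D chain (toy layout).
    def light_cone(num_qubits, num_layers, observable_support):
        even = [(i, i + 1) for i in range(0, num_qubits - 1, 2)]
        odd = [(i, i + 1) for i in range(1, num_qubits - 1, 2)]
        layers = [even if step % 2 == 0 else odd for step in range(num_layers)]

        cone = set(observable_support)
        # Walk backwards in time: any gate touching the cone pulls in its partner
        for layer in reversed(layers):
            for a, b in layer:
                if a in cone or b in cone:
                    cone.update((a, b))
        return cone

    for depth in (5, 15, 30):
        cone = light_cone(num_qubits=127, num_layers=depth, observable_support={63})
        print(f"depth {depth:2d}: light cone contains {len(cone)} qubits")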
So Sajant went ahead and was actually still able to simulate it and obtain the exact solution, using the light-cone reduction combined with an MPS method, and there we again saw a very similar trend, very consistent with the two earlier examples. So those are the results we wanted to share. Basically, from these observations we learned that a noisy quantum computer is still able to produce accurate expectation values at a scale beyond brute-force computation, and we think this serves as evidence for the utility of quantum computing before fault tolerance. Of course, now that we know it's producing reasonable results, we can push a little further, beyond exactly verifiable circuits, meaning we no longer have an exact solution for these two circuits. On the left-hand side, what we did is effectively evolve one more Trotter step; we eventually lose access to an exact solution, but at pi over 2 we know it's a Clifford circuit and the value is supposed to be one, and there we see the experimental value is close to one, whereas the approximation values deviate substantially from one. Another example, on the right-hand side, is a local observable, but this time we go all the way to depth 60; there the light cone eventually includes all 127 qubits. Again we saw a very similar trend, where the methods agree with each other very well at the beginning and then start to show discrepancies. At this moment, since we don't have an exact solution, we are not sure whether this fast decay is the right answer, or whether at these deeper depths all the noise sources mean it is no longer correct; we simply don't know. Fortunately, after we put out these results, there was a lot of very active discussion in the classical simulation community, and people ended up proposing various different types of numerical approximation methods, mostly variants of tensor network methods, and they show a very interesting trend. These are six different numerical approximation methods, all for the same circuit, the same example. For instance, one might naturally ask: you have pure MPS simulation results at a certain bond dimension, so why not apply a similar kind of extrapolation technique? That was done by our collaborators at Berkeley, which is this purple curve. So there are various kinds of tensor network methods proposed, and the trend is that they all agree very well when there is little entanglement present in the system, near the zero angle, but as we increase the entanglement and make the circuit non-Clifford, for instance around the pi over 4 point, the various methods start to mutually disagree at around the twenty percent level, and if we overlay our experimental results on top, some of the methods show agreement with experiment within the experimental uncertainty. This is pretty interesting, and we hope to improve our experiments so that we can provide more clarity here; I believe the classical community can also improve and provide more clarity, for instance by spending more computational resources, increasing the bond dimension, and so on. This back-and-forth interaction is actually something we were expecting, and we think it's really beneficial to both sides: in the absence of an exact solution, one can use the other's method as a check, and as long as we are aware of the assumptions and approximations behind each other's methods, this can be a really fruitful discussion. This continued back-and-forth will eventually inform what kinds of circuits are genuinely challenging for classical methods, and this is exactly why we are interacting with external partners through the 100x100 challenge, to identify challenging problems in various different fields. In terms of the experimental side, improving the experiments, there is some optimism. On the left-hand side is a T1 plot for Eagle revision one and revision two, which Andrew briefly showed earlier; there has been improvement, up to four milliseconds in some research devices, so there is some hope. This is also a two-qubit gate plot: there has been improvement there as well, as Andrew mentioned earlier, and there is a slightly different two-qubit gate architecture that shows a further improvement in two-qubit gate fidelity. So there is some optimism about future improvement. Okay, this is the take-home message, the last slide. We observe that a noisy quantum computer is able to produce accurate expectation values at a scale beyond brute-force classical computation, and this serves as evidence for the utility of quantum computing before fault tolerance. We expected, and have been observing, an active back-and-forth between the classical and quantum communities, and we hope quantum experiments will motivate advances in classical simulation algorithms and vice versa. We view this as a tool for testing near-term quantum algorithms, and we hope it will help and encourage the exploration of quantum applications at non-trivial scale in the near term. Since we have built this trust in our device and mitigation methods, we can now start to think carefully about how to choose hardware-friendly circuits, for instance, and more powerful error mitigation methods are in the oven. As mentioned earlier, we have put out this 100x100 challenge, and improvements are in progress across scale, quality and speed; we hope this will further increase the circuit volumes we can tackle on quantum devices in a continuous fashion. So this is the end of the presentation; thanks for your attention and your presence, I appreciate it. Thank you so much, Andrew and Youngseok, for this great two-part presentation; it's really a treat to have that. Also thank you, Abhinav Kandala, for answering questions here in the chat. There was one question that might be good to bring to everybody's attention; Abhinav, thank you for giving it a quick answer, but: what do you mean by brute force? I would have said basically matrix multiplication, but I know that question has come up before, so feel free to add if you want, or we can jump to the next one. And folks, this is a great time to post your questions, or repost them in the chat, and I'll bring them up to Andrew and Youngseok. Yeah, I think brute force means the following: typically these numerical approximation methods truncate at a certain degree, but brute force is really just a matrix-multiplication type of calculation that tries to capture everything, which is exponentially expensive as you scale up the problem size.
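To put a number on why brute-force statevector simulation is out of reach at this scale, here is the standard memory estimate; the 127-qubit figure is the device size from the talk, and complex128 amplitudes are an assumption about the numerical precision.

    # Memory needed to store a full statevector of n qubits with complex128 amplitudes
    bytes_per_amplitude = 16
    for n in (30, 50, 127):
        total_bytes = float(2**n) * bytes_per_amplitude
        print(f"{n:3d} qubits: 2^{n} amplitudes ~ {total_bytes:.3e} bytes")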
Okay, great, and I guess you don't mean sparse matrix multiplication. A question from Miguel: in the tractable cases you mentioned, or maybe more generically in the cases you mentioned, how do the simulations on the classical and quantum computers compare when it comes to runtime, cost, or memory use? In other words, accuracy is one objective and one metric, but there are things like cost and runtime as well; can you tell us more about that comparison? Yeah, we actually had some discussion related to that in our paper. For instance, take this particular experiment and look at how much time we spent producing each individual point on the quantum and on the classical computing resources. It takes about three hours to produce this point, and that three hours excludes the noise-learning time, so it might be a little longer than that, but more or less three or four hours. On the classical side, producing this point takes something like nine hours, so it's not drastically different; I'd say they are comparable. But at least we know what's going on on the quantum side, and if I say a little more about that, there is a lot of room for improvement: the actual quantum machine running time is much shorter, and a lot of the time is classical processing, such as generating pulses, loading the pulses onto the instruments, and so on, so I'd say there is large room for improvement on the quantum side. Thank you Youngseok, and thank you all for posting questions; please repost them or bring them up, this is a great time, and I see a few already queued up. A question from Jim Young Kim, and I hope I'm pronouncing that well, I apologize if I'm not: how did you choose the expectation values you're quoting? Why those and not some other ones? What about higher weights? Right, that's a really good question. First, I think we are physically motivated to look at magnetization; that's something you can immediately think of, and the average magnetization is therefore a good representative value to describe this system, rather than listing every individual quantity. Another motivation for looking at the higher-weight observables is that they are stabilizers of this particular Clifford circuit, producing either plus one or minus one. This one is the lowest-weight stabilizer, a weight-10 observable, which is supposed to produce one, and this is another stabilizer of somewhat higher weight, a weight-17 observable, which is also supposed to produce the value of one at pi over 2. The reason for these different choices of observable, the magnetization here, the weight-10 stabilizer here, is that we want to explore and check different circuit regimes: for instance, the magnetization is supposed to produce one at angle zero, which is one circuit regime, while at pi over 2, as I mentioned earlier, there's the suspicion that the signal might just be completely depolarized and the experimental value might not be meaningful, so that's why we also look at an observable whose value is supposed to be one at pi over 2. Excellent, thank you. A question from Cinewassan: why could you not apply probabilistic error cancellation, PEC, to these calculations? Do you want to answer that? Sure. At least with the most straightforward application of probabilistic error cancellation, where you cancel every error in the circuit, on every qubit and at every layer, you can estimate what the sampling cost is without running the circuit, by looking at the properties of your noise model, the error rates in it, and essentially multiplying together the cost of canceling every gate in the circuit. This grows exponentially with the circuit volume, and for these circuits the cost you get with that implementation is sort of astronomical: the time to run the experiment comes out as a large factor times the age of the universe. That was part of what motivated us to move towards zero-noise extrapolation. There may be room for improving on that by taking these light cones into account when implementing PEC; that's discussed, for example, in a paper, which may still be on the arXiv, I don't know if it's been published yet, by Minh Tran and collaborators, looking at how, if you only cancel the noise within the causal light cone of the observable, you can cut down the cost of implementing it significantly. I'm not sure exactly what the cost comes to here; I think it's still going to be more expensive, but I don't have the number offhand. Good, thank you Andrew, and folks, there's a link in the channel to a Qiskit seminar talk on PEC if you're interested in seeing how that bound scales. A question from Rui Hao Li: great talk; could you comment on the prospects of this kind of experiment for spin models beyond the Ising model, like the Heisenberg model? I'm not a total expert in spin models, but I think there should be some flexibility in updating the Hamiltonians used to define the Trotter steps, so I think it could be an interesting future direction. I can add a quick comment there: it's basically a question of depth; those models are just more expensive in depth. Great. Okay, a question from VTR; let's not post it up because the phrasing is a little funky, but the basic question being asked is: how does this scale to larger and larger circuits? It's going to have an exponential cost. I think the name of the game in near-term quantum computing is: is that exponential cost affordable enough to run a circuit that is interesting and useful? Hopefully, as hardware error rates continue to come down and our understanding of error mitigation improves, that will get us into the regime where we can start getting some value out of the machines.
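As a rough back-of-the-envelope for why full PEC becomes astronomical, here is a sketch assuming the gamma = exp(2 * sum of error-generator rates) overhead formula quoted for sparse Pauli noise models, with the measurement overhead growing as gamma squared over the whole circuit. The per-layer rate total and the layer count are made-up round numbers, not the learned rates from the experiment.

    import math

    # Assumed totals: sum of Pauli error generator rates per entangling layer,
    # and the number of noisy layers in the circuit (illustrative values only).
    lambda_per_layer = 0.5   # e.g. ~60 two-qubit gates per layer at sub-1% error each
    num_layers = 60

    gamma_layer = math.exp(2 * lambda_per_layer)     # PEC overhead of one layer
    gamma_total = gamma_layer ** num_layers          # overhead of the full circuit
    shot_multiplier = gamma_total ** 2               # extra shots for fixed precision

    print(f"per-layer overhead  gamma    = {gamma_layer:.2f}")
    print(f"full-circuit factor gamma^L  = {gamma_total:.3e}")
    print(f"sampling overhead ~ gamma^2L = {shot_multiplier:.3e}")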
Okay, thank you Andrew. A question from Nam, and I don't know why I keep saying "quick question" on every question: thank you very much, I see lots of collaboration, which is awesome, great job by the way; I'm wondering how long it took to finish this project. I see; I would say it really depends on which snapshot you are looking at. This project has been a continuing line of research: IBM tried out this kind of observable in 2017, tried to apply a similar approach and establish it in 2019, tried to scale that in 2021, and now we've tried to really scale it, to what we call utility scale, in 2023. So the research line itself has been continuing for a while, but I guess this particular paper and collaboration happened within about a one-year time frame. Okay, thank you Youngseok. I think this was a question from She, from earlier in the talk in the comments: one can also imagine performing an error mitigation extrapolation classically, by performing a series of simulations with increasing bond dimension and trying to extrapolate from those; can you tell us more about that line of thinking? Yeah, I think that's exactly what our collaborators from UC Berkeley did here. They took a pure MPS method and tried a very similar idea, trialing different bond dimensions. MPS is a controlled approximation, meaning that as you increase the bond dimension to an infinite degree it becomes exact, so the underlying reasoning behind the extrapolation is that the infinite-bond-dimension limit will hopefully produce accurate results. I think the Berkeley group tried two different ways of extrapolating, using one over the bond dimension as the x-axis, and also using fidelity as the x-axis, and the result is this purple curve you can see here. There are some instabilities observed near pi over 2, but this can be considered one possible or potential technique to tackle this type of problem. Okay, wonderful. Maybe we'll take one or two final questions here, as we're running a little over time. Andrew, coming back to the earlier question of scaling, could you clarify what you mean by exponential cost: what is exponential in what, and what is the cost you're referring to? Yeah, this might be a more involved discussion, but I'll try. In terms of the uncertainty in the exponential extrapolation used in ZNE: the Pauli errors within the causal light cone of the observable are generally going to cause the expectation value we're getting out, which is biased away from the true value, to decay closer and closer to zero, including at the different noise amplification levels. That exponential decay is what gives you an exponential cost in your ability to resolve each point, and also to resolve the differences between the points, such that you can meaningfully fit an exponential curve to them and get reasonable error bars on the extrapolated result. That's the scaling part of it. There's a separate question of the domain of validity of the exponential fit itself: I think the general form can be written as a sum of exponentials, according to one of the papers referenced during the talk, so you may need a more complicated fit model, or you might even need to go back to probabilistic error cancellation so that you have more of a theoretical guarantee that you're properly compensating for the errors. Excellent, thank you Andrew. Coming back to Arthur's question on context-aware calibration: does context-aware calibration address both spatially and temporally correlated noise, or only spatial, in other words crosstalk occurring in the parallel gates? This particular calibration is done by focusing on just one layer, so I would say temporal correlation was not really in mind; it's rather spatial correlation. Okay, thank you Youngseok. Folks, if I missed your question, I apologize dearly; feel free to repost it. I think we'll take maybe one more; this is a big-picture question from Sai, and feel free to be imaginative here: where can this kind of quantum utility be used in the future? The first thing that comes to mind is the list of possible applications of expectation values in near-term quantum algorithms, such as chemical simulations, optimization problems and machine learning kernels. I've mostly been thinking about getting the quantum computer to work these days, rather than what we will do with it once it's working, so I don't know if I have a totally satisfactory answer for that. Okay, great; Youngseok, do you have one favorite application area you want to share? My answer is largely along the same lines as Andrew's. Okay, I'll share mine then: mine is many-body quantum physics; we just put out a paper on it this Tuesday, so check that out if you're interested in materials and this kind of thing; that one is also a 124-qubit experiment. Okay folks, I see more questions coming in, but seeing as we are almost ten minutes over and we've had maybe 25 to 30 minutes of questions for Andrew and Youngseok, maybe this is a great time to thank Andrew and Youngseok for the wonderful presentation today. I saw a lot of comments from folks in the chat saying great talk, great presentation, really clear. With that, I'd also like to thank everybody who has tuned in, and who has been tuning in for the last 130 seminars, for making it back to the 130th. Folks, this talk will stay recorded, so you can go back and re-watch anything you missed; to know what's coming up, click like and subscribe if you enjoyed it, and with that we will see you next Friday at noon Eastern time.
Info
Channel: Qiskit
Views: 3,744
Id: hIUydsivY9k
Length: 69min 13sec (4153 seconds)
Published: Fri Jul 21 2023