Okay. It's the number of people who do it, not the chance of the thing happening. We're measuring the number of people doing the thing. Yeah. So in this one it's the number of people who say they support her, okay? We're interested in her level of support. The thing we're testing, the test statistic, is the number of successes, okay? So it's always the number of people that give a success, or the number of things, because it's not always people. I guess: number of successes, okay? In example one it was the number of heads, because we were looking at a coin and whether it was biased towards heads. In this one, we're interested in seeing if the support for this candidate is 40% or isn't 40%. So the test statistic is the number of people who do support her. Successes? Yeah, pretty much. Yes, it's always the number of successes. But if they ask you, they want it in the context of the question, okay. So yes, it's the number of successes, but you look at that and go, okay, what's the success? Success in this person's eyes is someone who supports her. Yeah, okay. And then part b, the hypotheses. So what's our null hypothesis going to be? It's that the probability of a person supporting her is 0.4. Yes, p equals 0.4. And then what about the alternative hypothesis? This one's hard. It's that the probability does not equal 0.4? So this is actually still a one-tailed test. So this is either a greater than or a less than. It's actually really, really tricky which way it is. The word we're looking at is "overestimating", because that's the key part: we think the researcher is overestimating. So if she's overestimating, would that make p greater than 0.4 or less than 0.4? It'll make it less than 0.4. Yeah, good.
Yeah, so those are our hypotheses. I completely get what you said, not equal to, and some of them will be that, but we get onto the two-sided tests a bit later because they don't quite come up just yet. With your alternative hypothesis, it's always going to be one sneaky word in the question, such as "overestimates", where you go, okay, so that means p is less than 0.4 for the alternative hypothesis. Okay, brilliant. And then c. We've not looked at this sort of question yet. We talked through the full process for example one, but we didn't have a question like this in example one. So: explain the condition under which the null hypothesis would be rejected. So if we were to do this hypothesis test, what would we do next? And what would we actually be trying to find out? We'd say, first of all, X is binomially distributed, wouldn't we? Yeah. Have we got a number of trials? 20. 20, good. And then what do we put for the probability? Yeah, good. We assume the null hypothesis is true. And then what are we actually trying to work out, because this is the key part: what do we want the probability of? Oh, of the support... So in the question, out of those 20 people, how much support did they have? Three out of 20. So it's something to do with three, because three is the key bit here. When we did the coin one, we talked through it, but I think we said there were five heads. We didn't work out X equals five. We did something else. Oh, bigger than five. Bigger than five? Yeah. Why did we do bigger than five? Because that's the thing that we want to test. So it's not always bigger. It won't always be greater than, so why did we go greater than last time?
Because we wanted to see if the coin is biased towards heads. So do we want a greater than this time, or do we want a less than? Less than. Okay, why? I mean, um... it's the word they've used. We just want to show that she's overestimating, and only three of the 20 people supporting her means it's already below her estimate, right? So we're under her estimate, yeah, because 40% of the 20 would be over three. It would be eight, actually. But we don't just work out the probability of exactly three. The phrase we use is always "three or more extreme". Okay, so in which direction is more extreme? Less than three. Yeah, good. We want X less than or equal to three. Okay? And look, the way you work that out is it's always away from the expected value. So you could do, which I think you've been doing sort of mentally, 20 times 0.4, which gives you eight. So in a perfect world, if she's got 40% of the support, then eight people would say they support her. Okay? So it's always away from that expected value, that mean, if you want to call it a mean. Okay, same with the coin one. There we had 0.5 and we did eight tosses, so you expected four. So when it was five and we wanted five or more extreme, that's why it was greater than or equal to five. Good. Okay. So yeah, we do this. We don't actually have to do it in this question, but we get some number for that. We work that out, we get some probability, then what do we do? Then we compare it to the significance level. Brilliant. We compare it with that 5% significance level. And then that takes us to this question. So what probability do we need, as an answer compared to our significance level, to be able to reject the null hypothesis, to reject the claim that she has 40% of the support? It needs to be less than it. Brilliant. Yes, exactly that. Yes.
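As a rough check of the reasoning above, here is a minimal Python sketch for the election example. It assumes the model X ~ B(20, 0.4) under the null hypothesis, an observed count of 3 and a 5% significance level. The helper name `binomial_cdf` is made up for illustration and does not come from the lesson or the textbook.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

n, p0, observed = 20, 0.4, 3      # values taken from the election example
expected = n * p0                 # 20 * 0.4 = 8 supporters in a "perfect world"

# The observed 3 is below the expected 8, so "more extreme" means fewer: P(X <= 3).
p_value = binomial_cdf(observed, n, p0)
reject_h0 = p_value < 0.05        # compare with the 5% significance level

print(f"expected = {expected}, P(X <= 3) = {p_value:.4f}, reject H0: {reject_h0}")
```

With these assumed inputs the probability comes out at about 0.016, which is under 5%, so a sample like this would lead to rejecting the null hypothesis.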
If we get a value under 5%, essentially what that does is we go, okay, if she's got 40% support, the chance that three out of 20, or more extreme, support her is really, really unlikely. Therefore, if it's really, really unlikely to happen under the null hypothesis, we assume that the null hypothesis isn't right. Okay? We can't say with certainty, because this could happen. You know, three people out of 20 could say they support her even at 40%. But if it's really, really unlikely, we go, yeah, it's probably not happened by chance. She probably hasn't got that 40%. That's exactly what they've said here. Okay, so they've said the null hypothesis will be rejected if the probability of three or fewer (so that's what we got) people saying they support the candidate is less than 5%. Again, that's what you said. Okay. Given that, yes, we always assume the null hypothesis is true. Brilliant. And then 7A is just all about: can you explain how a hypothesis test works? Can you find the two hypotheses? Can you define the test statistic? Which, yeah, I think you can. There's not a massive amount to these, to be honest. They get wordy. But these are just like the two examples we've done that are wordy. I want to skip 7.2 for now. First, I want to talk through where the marks are. So if we had this question and, instead of a, b, c, let's imagine the question was just: test, at the 5% level of significance, the researcher's claim. So there's a mark up for grabs for the hypotheses. That's mark one. If you do that and nothing else, you'd actually get one out of four, okay? They're usually out of four, maybe five depending on how nasty it is. Okay, you should also define the test statistic. So you should always go: X is the number of people who support her. It's not necessarily worth a mark, okay, but it's good practice. Your next solid mark is for working out this probability. Okay?
So if we went, oh yeah, that equals, I don't know, let's pretend it's 0.04. Okay? If it is that, that's mark two. So you actually do the maths, you get the second mark. Okay, the third and fourth marks come from your conclusion. So it's that paragraph that you write at the end. The third mark comes for correctly comparing it with the significance level to give the correct outcome. So sort of like in c: let's say it was 0.04, you'd go, oh, this is less than 5%, so we can reject the null hypothesis. That's mark three. And then the fourth mark is to put it into the context of the question. So you just have to look at the question and go, okay, let's imagine we do reject the null hypothesis, you could then say there is evidence to suggest that the election candidate has less than 40% of the residents' support. Okay? So not too difficult to do, but that bit changes a lot per question because you've got to use whatever scenario they've given you. So this test statistic's not necessarily worth a mark. Same with this line. But you still want to write them in because they're relevant to your question. Okay. So at this point, I like to skip critical values and actually look at the full hypothesis tests, because I think once we've talked through the whole process and done the first half of the process, it makes more sense to do the second half of the process, essentially, which is where this one comes up. So, one-tailed tests. We'll start with one-tailed, then we'll look at two-tailed. This is basically bullet points for how it works. We've already talked through this, but yeah: you formulate a model for your test statistic, you identify your hypotheses, you work out your probability of the thing happening, you compare it with your significance level, and then your conclusion. Okay, as this bit says down here, you can technically also do it with critical regions and then see if your observed value falls within the critical region.
However, since your calculator does all the nasty stuff for you and you don't have to use a table, it's quicker not to use critical regions. Okay? Back when I did this, critical regions was a genuine, valid method because you'd look it up in a table and you'd see all the values there anyway, whereas now you don't need to do that. So if you can avoid using the table and critical regions, I would recommend you do. Okay, so you'll see the method here and we'll come to it eventually, but I don't think it's as useful as the textbook makes out now, because your calculator does the bulk of the maths for you. So yeah, we want method one. Okay, method two we'll look at more later.
Okay. So, example five. The standard treatment for a particular disease has a two-fifths probability of success. A certain doctor has undertaken research in this area and has produced a new drug which has been successful with eleven out of 20 patients, and claims the new drug represents an improvement on the standard treatment. Test, at the 5% level of significance, the claim made by the doctor. So this is our first look at a full question. That's the wording of a full hypothesis test question. That last sentence is how they always put it: it's always "test, at the some percent significance level, the claim made by the person in the question". So how are we going to start this off? We state the test statistic. Yeah. So what's X? Let's call it X. What is it? Can I scroll it up a little bit? Is that okay? Yeah, go for it. So that's what I did a minute ago. The test statistic is the number of people... sorry. Yeah, okay. So the number of patients who are treated successfully. Always the number of successes. Yeah, okay. So in this case, a success is a patient who gets treated successfully for the disease. Okay, you don't have to write it as a full sentence just like that. "X is the number of patients where the drug succeeds" is plenty. Okay? They've also defined what p is.
So again, another one you don't strictly need to do, but if you don't do it, the maths isn't technically complete. Okay. So I would always start with X: number of people the drug worked on. And then p: probability of success. It's always the probability of success on a person. They don't call it a person, though, so call it a patient. Fine, okay. What comes after that? Ignore what they've done, because I wouldn't necessarily do it in their order. Null hypothesis. Yeah, good. H0 and H1. What's H0? 0.4. Yeah, good. Where's the 0.4 come from? The two... oh yeah, two fifths. And H1 is? Hmm. Probability bigger than 0.4. Good. Yeah. Why? I can see that's what it is down here, but why is it that? Because the doctor claims that the new drug represents an improvement. Yeah. So the only bit we get there is "improvement". Improvement tells us that he thinks it's over a 40% chance of curing the disease. Brilliant. Okay, now what do we do? Now we work out the probability. Good. Yeah. So before you do that, this is sort of where they write this, but here I would then write "assume H0 is true", because for finding the probability we do it under the null hypothesis. So we're assuming this is true. Then I'll write, oh, that means X is binomial, with number of trials? 20. Good. And then our probability of success is whatever p is, so 0.4 still. Okay. Then jump into the probability. So what are we trying to work out the probability of? Probability of X... sorry, bigger than eleven. Nearly. We also want to look at eleven. Yeah, good. So there'll always be a greater than or equal to, or a less than or equal to, because it's always the value observed in the question or more extreme. Lovely. Okay, now it comes down to our binomial distribution skills. How do I work out the probability of that?
One minus... good... X smaller than eleven. Good. Can the calculator do X is less than eleven? We need it to be a less than or equal to, so it'll be less than or equal to what value? Less than or equal to ten. Yeah, good. Less than or equal to ten, and less than or equal to ten we can work out. Okay. Oh gosh, sorry, one sec. Okay. Yeah, it's just I forgot to press something. 0.128. Yeah, good, 0.128. Okay, now we compare it with the significance level. Brilliant. So is it greater than or less than the significance level? It's greater than it. Yeah, good. It's over. It was 5%? Yeah. So what does that mean? Insufficient evidence. Yeah, good. Insufficient evidence to reject H0. And then the last bit: put it in the context of the question. To reject that... there has been an improvement in the success of the new drug? So is there an improvement? There might be, but there isn't... oh my gosh. There's insufficient evidence to prove that there has been an improvement with the new drug. Insufficient evidence (my e's look awful on here), insufficient evidence to suggest this new drug is better. Brilliant. Yeah, exactly that. Okay. And that is a standard binomial distribution hypothesis test question. That's how they'll look, and this is exactly how you want to lay it out. So define your X and your p, get your two hypotheses, assume H0 is true, write down your distribution, work out the probability of the thing in the question happening, or more extreme, compare with the significance level, conclude, and then conclude in the context of the question. If we look at the conclusion they went with: there's not enough evidence to reject H0, the new drug is no better than the old one. But it's strange, because I bet in your head, and I always did, whenever I look at this question with students, eleven out of 20 is loads. Everyone is really surprised by this. They go, eleven out of 20 against 40%? Yeah, that's an improvement.
Eleven is way over eight, but actually eleven out of 20, or more extreme, still happens over 12% of the time, nearly 13% of the time. So it's not actually that unlikely to happen, even though it sounds quite good. You will get some questions like this. There's one later on with a coin flip that is really, really surprising as well, actually. You actually need quite a high number. So the number of patients needs to be quite high before you go, actually, yeah, now we can say the new drug's better. Okay. So this is one of those questions where the answer won't always agree with what your mind thinks the answer should be. Going into this one, it looks like a vast improvement, eleven's loads, right? I'd say the new drug is better, but mathematically the evidence isn't there, not at the 5% level of significance anyway. Okay. Method two uses critical regions. Essentially, what they do is work backwards: they find what value gives a probability below 5%, and then they go, okay, that value, apparently it's 13, so you need 13 or more successes to fall into that critical region. And then they compare 13 to the eleven and do the conclusion the same. So after we've looked at critical regions, we can do it that way if you prefer. However, I think it's more long-winded to get to that, so I don't think it's worth the hassle personally. Look, there are some questions, though, that will ask you to find critical regions. So we need to know it, but I wouldn't use it as a default method. Okay. And that's it for hypothesis tests for one tail, really. That's a very, very standard question. In an exam it will always be in the context of a question. I know we had some up here, where's it gone, like question two, where there's no context. They don't really do that in an actual exam. There'll always be some scenario they throw at you.
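To see both methods side by side for the drug example, here is a small Python sketch. It assumes X ~ B(20, 0.4) under H0, an observed value of 11 and a 5% one-tailed test in the upper tail. The function names are illustrative only, and the search for the critical value is one straightforward way of "working backwards", not necessarily how the textbook sets it out.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def upper_tail(k, n, p):
    """P(X >= k), worked out as 1 - P(X <= k - 1)."""
    return 1 - sum(binomial_pmf(r, n, p) for r in range(k))

n, p0, observed, alpha = 20, 0.4, 11, 0.05   # Example 5 values

# Method one: probability of the observed value or more extreme.
p_value = upper_tail(observed, n, p0)

# Method two: smallest c with P(X >= c) below the significance level.
critical_value = next(c for c in range(n + 1) if upper_tail(c, n, p0) < alpha)

print(f"P(X >= 11) = {p_value:.3f}")               # about 0.128, as in the lesson
print(f"critical region: X >= {critical_value}")   # 13, as quoted from the textbook
print("reject H0" if observed >= critical_value else "do not reject H0")
```

Both routes give the same decision here: 0.128 is above 0.05, and 11 falls short of the critical value of 13, so H0 is not rejected.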
So that could actually be in an AS applied paper, okay? Which is why there's only the one example, because they're all like that. If you have a read of any of them... actually, look, I'm going to just scroll down. So all of these ones, the first six, are no context. Beyond that, seven onwards are the ones you're going to see in an actual exam. But personally, I don't think the context makes it that much harder. Finding the alternative hypothesis is always nasty. And the conclusion is the bit that, historically, I've found students struggle with. Most people that I've tutored or taught, when they get to a hypothesis test question, they're quite good at getting the first mark. You know, it's a learned skill. You can identify the word for the alternative hypothesis. Not too bad. Mark two is actual maths. This one varies. It all depends on how good students are with the binomial distribution. So if you find that you're not getting that mark, it's because your binomial distribution is not as good as it could be. Then the conclusion marks are where lots of students just drop off. Loads of students will get one or two and then just move on. Because to get the conclusion marks, you've got to actually understand how a hypothesis test works and the reason behind it, which is why we spent ages last lesson, and at the start of this lesson, just talking through the process. If you don't understand that process, the last two marks are really, really hard to get. And it's all about that conclusion, and particularly putting it into context. This last line is the bit that people just can't do. Sometimes they go, oh yeah, it's over the significance level, insufficient evidence to reject H0, and they get that. And then they either forget or don't understand the last bit, which is the context. Okay, so shall we do a few of these, or a couple of these? Yeah, sure.
Yeah, which one takes your fancy? Three... to ten... yeah. There we go. A polling organisation claims the support for a particular candidate is 35%. The candidate will pledge to support local charities if elected. The polling organisation thinks that the level of support will go up as a result. It takes a new poll of 50 voters. Okay, so: test statistic, null hypothesis, alternative hypothesis. The test statistic is that more than 35% of them... sorry. No, that's not the test statistic. It's always the number of successes. So what are they classing as a success in this? That there is support for the particular candidate. Brilliant. Yeah. So the test statistic is the number of people who support the candidate. Okay? So there are no actual numbers involved, it's just the number of people who do the thing we're interested in. Okay, who support the candidate, yes, so the number of people who support the candidate. Oh, I know why I always get stuff confused for the first one. It's weird, that. So, annoyingly, sometimes this isn't worth a mark. Sometimes it's sort of lumped in with the hypothesis mark. It's always good to write, because it's mathematically correct. You need it there. You need to define what X is and you need to define p, otherwise your answer doesn't actually make any sense. Yeah, brilliant, the test statistic is that, and we call that X normally. And then what about the null and alternative hypotheses? Yeah, brilliant. And then H1? Yeah, brilliant. Okay. The word you got from that is they reckon the support would go up. So yeah, you don't get much to go on, but well done, you've identified it. Part b is then to do the hypothesis test using the 5% level of significance. Oh, they want the critical region only, but we'll change the question. We're not going to find the critical region. We'll just do the hypothesis test, because we haven't done critical regions yet.
We'll look ahead at c, and they said 28. So we'll assume 28 people and we'll do the hypothesis test on that. Good, yes, 28 or more extreme. Wait, don't write X is... oh sorry, I'd just put equals here, actually. Now, looking at my table, what have we got, 50. Do I agree with your number? Yeah. Okay. Yes, that's less than 5%. So what? Sufficient evidence. Good. Yeah. So for the first mark, I would always talk in terms of H1 or H0: either say there's insufficient evidence to reject H0, or there's sufficient evidence to reject H0 or support H1. Okay. So either way round, it doesn't really matter which wording you go for. And then for that last bit, put it into the context of the question. So what does that actually mean in terms of the polling organisation? They're developing more support for the candidate. Yes. There's evidence to suggest the support has gone up. Good, yeah, brilliant. In the textbook, for the conclusion, they've got: there is evidence that the candidate's level of popularity has increased. If you're ever struggling to put it in context, use the language that they use in the question. So just look at what they've written and go, okay, there is evidence to suggest the level of support will go up, or has gone up. Okay. We picked the one question that wanted a critical region, but that's fine, we can amend that. But yeah, that again is a pretty standard question. They don't break it down like this in an actual question, though. They usually just throw the whole thing at you in one go. But you still do these steps, like what we did. Even if it just said "test at the 5% level of significance" with the 28 people, you'd still do this exact thing anyway. Yeah. Okay. So nothing really changes here. The next bit is two-tailed tests.
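The polling question can be sketched the same way, this time ending with the two-step conclusion described above. The set-up is inferred from the question as read out, so treat it as an assumption: H0: p = 0.35, H1: p > 0.35, a poll of 50 voters and 28 supporters observed, tested at the 5% level. The wording of the conclusion strings is only one acceptable phrasing.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k, n + 1))

# Assumed set-up: H0: p = 0.35, H1: p > 0.35, 50 voters, 28 supporters observed.
n, p0, observed, alpha = 50, 0.35, 28, 0.05

p_value = upper_tail(observed, n, p0)

# Step one: talk in terms of H0 / H1. Step two: put it in context.
if p_value < alpha:
    conclusion = ("Sufficient evidence to reject H0. "
                  "There is evidence that the candidate's level of support has gone up.")
else:
    conclusion = ("Insufficient evidence to reject H0. "
                  "There is no evidence that the candidate's level of support has gone up.")

print(f"P(X >= 28) = {p_value:.4f}")
print(conclusion)
```

Keeping the decision and the context as two separate sentences mirrors how the last two marks are described in the lesson.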
So far we've only looked at one-tailed tests. So H1 has always been a greater than or a less than. With a two-tailed test, what happens is... let's jump straight into the example and then we'll come back to this bit here. What happens is the person who has a new hypothesis won't say they think it's gone up or down. They'll say, in this question, it's that word there. Okay. So they're not sure which way it is. Okay, so who's doing it, is it Manuel or is it... I'd rather read the full question.
Over a long period of time, it has been found that in Enrico's restaurant the ratio of non-veggie to veggie meals is two to one. In Manuel's restaurant, in a random sample of ten people ordering meals, one ordered a veggie meal. Using a 5% level of significance, test whether or not... okay, so no one in particular is claiming it, but if they're not saying whether they think it's gone up or it's gone down, they say it's different, it's what we call a two-tailed test. What happens is H1 just has a not equal to. Okay, so that's the first change, and that'll be the most obvious one. And then the second thing that changes is the significance level. In the question they've set a 5% level of significance, but because it's a two-tailed test, each tail gets half of the significance level. So when they're comparing the answer they get with the significance level, they've halved it, okay, as this bit here shows. So instead of it being 0.05, they've got half of 0.05, which is 0.025. Okay, and that is the only difference. So there are two differences. Your alternative hypothesis will have a not equal to, and you halve the significance level. That's it. Everything else stays exactly the same. The thing is... cool. But then how do we know the... like, let's say normally after the X bit, after the H0 bit, then we write H1. So p and the number don't change. It's just a not equal to instead of a less than or a greater than. You literally just cross the equals sign out. But then what do we put in the calculator?
I see what you mean. So the maths doesn't change, actually. So for this question (I keep scrolling too much, there we go), X is the number of people in the sample who order veggie. So it's the number of successes in our sample. p is the probability that a randomly chosen person orders a veggie meal. Significance level and all that. Assume H0 is true, so we get this binomial distribution. We've got ten people, because there's ten of them, with a probability of success of one third. The thing we work out, again, is always the chance of the thing that's observed, which was one, or more extreme. Okay, in this case more extreme than one is less than or equal to one, because we want to go away from the expected value. If we're expecting a third of people to order veggie, then three and a bit people out of ten should order veggie. So we're going away from that three. Okay? The reason we halve the significance level is because until we get that observed value, we don't know which side of the hypothesis test we're looking at. So they've gone: I think it's different, I don't know which way, I just think it's different. We're testing at the 5% level of significance. So two and a half percent covers whether it's decreased and 2.5% covers whether it's increased. And then the sample tells us which end we're actually looking at. Okay. So in the sample, only one person ordered veggie. Because that's on the lower end, we only look at the lower end, where there's a 2.5% level of significance. If instead of one person, let's say it was eight people, we'd then look at the upper end and go, okay, well, it's still 2.5%, but we're looking at the upper tail. So we'd have greater than or equal to eight as the thing we were trying to find. Okay, so this bit doesn't actually change. It's always the value observed or more extreme. Okay, this looks dodgy. Don't do that. Just type it into your calculator. Don't do that, that's old school.
We would just jump straight to that on the calculator. Okay, and then your conclusion will be a bit different. You won't say it's decreased, you'll just say it's different, same as they did. Okay. I want to now take it back to this bullet-point bit, so we can look at the theory a bit more. So a one-tailed test is used when the claim says it's gone up or gone down. Two-tailed is used when they don't state a direction of change, or they're not sure. So sometimes it'll be phrased as: Manuel thinks it is different, but he's not sure how it's changed. Okay, there'll be some sort of uncertainty, or they'll just go "different". And then, yeah, this bit here. We need to know which tail of the distribution we're testing. You don't actually test both of them. You only end up testing one. So we look at the expected outcome. We looked at this already, actually, where we take the number of trials times the probability. With the simpler questions, you were doing this automatically. We looked at the coin flip one and we had eight tosses. You went, well, four heads, because it's half of them. Okay? So the actual mathematical way you get that expected outcome is the number of trials times the probability. So in the Enrico's restaurant question, we've got ten people in our sample. We didn't talk about this, actually: they gave the probability of success as a ratio. That's where the third came from, because two to one means one out of every three. Okay? So we'd do ten times a third and get 3.3 recurring. That's our expected value. Obviously you can't have 0.3 of a person ordering a veggie meal, but it's three and a bit. And then we're always testing the observed value or more. I like to say "or more extreme"; they've spelt it out in a slightly longer way. As they said, if the observed value x is lower than this expected outcome, you test X less than or equal to x. If it's higher than the expected value, then we test greater than or equal to that value. Okay. As they say here, it's normally obvious which way you should test.
You don't often get one that's really, really close to that expected value. Like if the expected outcome is 3.3, they probably won't give you three or four. If it's one that's less, it's going to be one. If it's one that's higher, you're looking at easily six or above. So it's usually quite obvious which way to go, okay. As I say, that's all the changes in the two-tailed test. Okay. So your null, no, not your null, your alternative hypothesis will have a not equal to in it. Everything else stays the same. So the actual maths doesn't change here. I've rubbed it all out. Let's pretend we can still see it. So the actual maths stays the same. And then when you compare with the significance level, you halve the one in the question. Okay, they're the only two changes. The conclusion is still the same. I suppose you can say the conclusion has the word "different" in, but you just use the language that they're using in the question. Yeah, method two, again, they've done it with critical regions. So with two-tailed tests, unless they ask you to, I think this is really silly to do, because they've had to find two critical regions. They've said that you have to be at zero, or seven or above, to be critical, and then go, okay, one is not in the critical region. So yeah, but again, I wouldn't do this method unless they specifically tell you you have to. Okay. And then again, it's just more questions like that. The first five, yeah, there's no context to them. They're not bad, but I personally wouldn't even bother looking at them, even if you were looking to do loads of questions. I would jump straight to six and do six to ten, because they're the ones that are more exam based and more realistic to what you're going to see. Okay. Which one? You pick one. Do you want to scroll? This time we'll look ahead at the question.
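The two changes for a two-tailed test (picking the tail from the expected value, and halving the significance level) can be sketched in Python as follows. This assumes the restaurant example's values: 10 people, p = 1/3 under H0, 1 veggie order, 5% level. The function name `two_tailed_test` is hypothetical, and treating an observation exactly equal to the expected value as the upper tail is a simplification of mine, not something the lesson covers.

```python
from math import comb

def pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def two_tailed_test(observed, n, p0, alpha):
    """Return (tail, probability, reject) using half of alpha in each tail."""
    expected = n * p0
    if observed < expected:          # below expected: test the lower tail
        tail = "lower"
        prob = sum(pmf(r, n, p0) for r in range(observed + 1))
    else:                            # above expected: test the upper tail
        tail = "upper"
        prob = sum(pmf(r, n, p0) for r in range(observed, n + 1))
    return tail, prob, prob < alpha / 2

# Restaurant example: 10 people, p = 1/3 under H0, 1 veggie order, 5% level.
tail, prob, reject = two_tailed_test(1, 10, 1 / 3, 0.05)
print(tail, round(prob, 3), reject)

# The hypothetical "what if it was eight" case from the lesson.
print(two_tailed_test(8, 10, 1 / 3, 0.05))
```

For the observed value of one, the lower-tail probability is about 0.104, which is above 0.025, so H0 is not rejected. That agrees with the critical-region route, where one is not in the critical region.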
We'll pick one that doesn't mention a critical region, so we don't have to do it that way. We can just do the last one. Yeah. Let me minimise that. Okay. So, question ten. A standard blood test is able to diagnose a particular disease with probability 0.96. A manufacturer suggests that a cheaper test will have the same probability of success. It conducts a clinical trial on 75 patients. The new test correctly diagnoses 63 of these patients. Test the manufacturer's claim at the 10% level, stating your hypotheses clearly. First, state the test statistic. Good idea. X, so what is X? You can just call that X. Yeah, it is the test statistic, but you can just call it X to minimise how much you've got to write. Number of successes made by the cheaper test? Yes. The number of patients correctly diagnosed? Yes. The number of successes. Good. By the cheaper test, yeah, made by the cheaper test, because that's what we're testing. Lovely. Let's define p as well. What actually is it? It's the probability of what happening? 0.96... so not what numerical value it is, what is it actually measuring? It's the probability of what happening? Oh sorry, of making a right diagnosis. Yeah, exactly. Here p is the probability of a successful diagnosis. So H0... yeah, it feels weird, this one, because it's a really, really high probability. Yeah, and then the alternative? Just p not equal to. Yeah. So it should have been obvious it was not equal to, because we're in the two-tailed section. However, the language in the question, it's the fact that they think it should have the "same" that tells you it's that. So you are right. It is that, but you should have known it was that based on the fact that we're in the two-tailed section, so they're all two-tailed.
So in an actual exam, you've got to identify that when it says they think the cheaper test will have the same probability, that means this. In general, if you can't identify them saying "I think it's gone up" or "I think it's gone down", it will be a two-tailed test. Okay, now what? Yeah, good. So we say smaller than or equal to, yeah. Yeah, relax. We want more extreme. You want to include the 63. You need to include the value in the question. You need to include 63. Not that I imagine it makes much difference to the question, but yeah, there you go, and then you work that out. But the thing is, what if I include 63? Wouldn't that mean it's the same probability as 62? So less than or equal to 63 and less than or equal to 62 will be slightly different. There won't be a massive difference, but there will be a little difference. But the thing is, if I include 63, doesn't that mean that it has a chance of it being the same, because it might be 63? So in their sample they had 63, so we always test the thing they got in the sample or more extreme. Oh, okay, so you need the 63 as well because that's the thing they measured it to be. I don't think my table goes up that high, so I might not be able to match you with a value here. No, my table stops at n is 50. Oh, I've got the answers somewhere. I have got the answers. That's fine. I can just cheat and do it that way. Wait. I think my calculator isn't doing that as well. Not doing that. What's it saying? The value you entered is not allowed in the function or command. What did you type? You should try again because your calculator should do it, because it's one of the graphical ones. The normal calculator should do it, so yours should do it as well. Sorry, I forgot to put the 0.96, I just put... Okay, okay. So you put 96. Yeah, and that's why it didn't like it. It's a really, really small value though. That's tiny. Yeah. Did you get 0.0000417? Yeah, I just got it in, like... That's fine. Yeah, put it on the phone.
That's fine. Yeah, that works. Yeah, tiny. Okay. And then what significance level do we test that against? Zero point... Nearly, it's two-tailed. Wait, sorry, can I scroll back? Oh sorry, 0.05. Good. Yeah, halve it, so it was 10% but halve it because it's two-tailed. Good. So it's below that. So what? To support... Oh, sorry, I know. There is sufficient evidence to support... To support that it will have the same... It will, yeah, it will have the same probability of success. But are we supporting the null hypothesis or the alternative hypothesis? So you want your conclusion to be two-step. So the first step is, is it for or against the old system? So you said sufficient evidence to support. So which one does this support? Does this value support H1 or H0? So... H1. Yeah, because this is the chance of it happening under H0, which is basically zero. That's as low as it can get really. So there's evidence for H1. Good. So that gets you one of your marks. So to get your last mark, put it into the context of the question. So what does that tell us about the new test? The new test does not have a probability of 0.96. Exactly that. Yeah. So the new test does not have the same probability of success as the old test. Or technically, there's evidence to suggest it doesn't have the same. However, at that sort of probability, we can all but guarantee it. Yeah, brilliant. There you go. That then is the end of hypothesis testing, other than the critical region bit. We didn't quite get onto the critical region bit. It's an alternative method. But also sometimes the question will ask you to find a critical region, so it is worth looking at. And if we had time we'd look at it now, but we don't, unfortunately. I spoke to someone from the tutoring company yesterday about rearranging. They haven't given me a day yet. I don't know if they've said anything to you.
I know, in the original thing, I don't think you're back at school till next week. Yeah, because it's okay on the tenth and eleventh, but I'm traveling on the tenth to Beijing and on the eleventh I'll be on the plane. So if we can do it earlier, like maybe before the tenth... The trouble with that is I'm back at school Monday, and I think you're eight hours ahead of me, so I can't do it before school, and after school the earliest I could do is like 4:00, which for you is like midnight. How about after the eleventh, the weekend after the eleventh? Because with the weekends, I can only really do... As in a different weekend? Yeah, a different weekend potentially. I'll have to look at what my calendar is, but yeah, I can have a look at that. Yeah. Okay. Yeah. Well, today we covered a lot of maths again there. You know, if you're doing this in school, hypothesis testing is a big topic and we've covered it effectively in under two hours. So well done, and hopefully I'll see you soon. Yeah, bye. In a bit.
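The blood test question worked through in this session can be checked numerically. The sketch below is a minimal Python version of the p-value method using only the standard library: X ~ B(75, 0.96) under H0, the lower tail P(X ≤ 63) including the observed 63, and a comparison against half of the 10% level because the test is two-tailed. The helper name `binom_cdf` and the variable names are assumptions made for this note, not anything from the textbook or a calculator.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed directly from the binomial formula."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k + 1))


n, p0 = 75, 0.96   # number of patients and probability of success under H0
observed = 63      # correct diagnoses by the cheaper test in the sample
alpha = 0.10       # 10% level; two-tailed, so each tail gets alpha / 2

# The observed value is below the expected value n * p0 = 72, so
# "as or more extreme" means the lower tail, including 63 itself.
p_value = binom_cdf(observed, n, p0)
reject_h0 = p_value < alpha / 2

print(f"P(X <= {observed}) = {p_value:.3g}")
print(f"Compare with {alpha / 2}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The printed probability should agree with the calculator value of about 0.0000417 quoted in the session, which is far below 0.05, so H0 is rejected: there is evidence that the cheaper test does not have the same probability of success.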
Analysis results
{
"header_icon": "fas fa-crown",
"course_title_en": "Maths Course Summary",
"course_title_cn": "数学课程总结",
"course_subtitle_en": "A-Level Maths 1v1 Tutorial",
"course_subtitle_cn": "A-Level 数学 1对1辅导",
"course_name_en": "A level Maths Alice",
"course_name_cn": "A level 数学 Alice",
"course_topic_en": "Hypothesis Testing (Binomial Distribution)",
"course_topic_cn": "假设检验 (二项分布)",
"course_date_en": "N\/A",
"course_date_cn": "未提供",
"student_name": "Alice",
"teaching_focus_en": "Deep dive into the mechanics and structure of one-tailed and two-tailed hypothesis tests using binomial distribution, focusing on identifying hypotheses, test statistics, calculating P-values, and drawing contextual conclusions.",
"teaching_focus_cn": "深入探讨使用二项分布进行单尾和双尾假设检验的机制和结构,重点关注识别假设、检验统计量、计算P值和得出背景性结论。",
"teaching_objectives": [
{
"en": "To correctly identify the null (H0) and alternative (H1) hypotheses in word problems.",
"cn": "能够正确识别应用题中的零假设 (H0) 和备择假设 (H1)。"
},
{
"en": "To understand the difference between one-tailed and two-tailed tests and how significance levels are adjusted.",
"cn": "理解单尾检验和双尾检验的区别,以及如何调整显著性水平。"
},
{
"en": "To execute a full hypothesis test, including stating the test statistic, calculating the probability, comparing it to the significance level, and concluding in context.",
"cn": "执行完整的假设检验过程,包括陈述检验统计量、计算概率、与显著性水平比较,并给出背景性结论。"
}
],
"timeline_activities": [
{
"time": "0:00 - 12:00",
"title_en": "Review of Test Statistic and Hypotheses (Example 1 Context)",
"title_cn": "检验统计量和假设回顾 (例1情境)",
"description_en": "Clarified that the test statistic is the 'number of successes' (e.g., number of people supporting a candidate) and established H0 (p=0.4) and H1 (p<0.4) based on contextual words like 'overestimating'.",
"description_cn": "明确检验统计量是‘成功次数’(例如支持某候选人的人数),并根据‘高估’等上下文词汇确定了H0 (p=0.4) 和 H1 (p<0.4)。"
},
{
"time": "12:00 - 27:00",
"title_en": "Explaining Rejection Condition and Binomial Setup (Example 1 Part C)",
"title_cn": "解释拒绝条件和二项分布设置 (例1 C部分)",
"description_en": "Discussed the condition for rejecting H0 (P-value < significance level) and set up the binomial distribution (X ~ Bin(20, 0.4)), focusing on calculating P(X ≤ 3), since the observed 3 lies below the expected value (8) and 'three or fewer' is the more extreme direction.",
"description_cn": "讨论了拒绝H0的条件 (P值 < 显著性水平),并设置了二项分布 (X ~ Bin(20, 0.4)),重点关注计算 P(X ≤ 3),因为观测值 3 低于期望值 (8),‘三个或更少’是更极端的方向。"
},
{
"time": "27:00 - 37:00",
"title_en": "Hypothesis Test Marking Scheme & Method Overview",
"title_cn": "假设检验评分标准与方法概述",
"description_en": "Detailed the 4-mark structure for hypothesis tests (Hypotheses, Probability calculation, Comparison, Contextual Conclusion) and contrasted P-value method (Method 1, preferred) vs. Critical Region method (Method 2).",
"description_cn": "详细阐述了假设检验的4分结构(假设、概率计算、比较、背景性结论),并对比了P值法(方法1,首选)与临界值法(方法2)。"
},
{
"time": "37:00 - 56:00",
"title_en": "Full One-Tailed Test Practice (Example 5)",
"title_cn": "完整单尾检验练习 (例5)",
"description_en": "Worked through a full test (New drug vs standard treatment, H1: p > 0.4). Emphasized that the intuitive result (11\/20 seems good) might not lead to rejection (P=0.128 > 0.05).",
"description_cn": "完成了一个完整检验 (新药与标准治疗,H1: p > 0.4)。强调直观结果(11\/20 看起来不错)可能不会导致拒绝 (P=0.128 > 0.05)。"
},
{
"time": "56:00 - 1:08:00",
"title_en": "Two-Tailed Test Introduction (Example: Restaurant Veggie Orders)",
"title_cn": "双尾检验介绍 (例:餐厅素食订单)",
"description_en": "Introduced two-tailed tests where H1 is p ≠ p0. Key changes: H1 uses '≠' and the significance level must be halved (e.g., 5% becomes 2.5% for comparison at the observed tail).",
"description_cn": "介绍了 H1 为 p ≠ p0 的双尾检验。关键变化:H1 使用‘≠’,且显著性水平必须减半(例如,5% 减半为 2.5% 用于与观察到的尾部进行比较)。"
},
{
"time": "1:08:00 - End",
"title_en": "Two-Tailed Test Practice and Conclusion Summary (Example 10)",
"title_cn": "双尾检验练习与结论总结 (例10)",
"description_en": "Applied two-tailed test logic to a blood test scenario (claim: p = 0.96, with 63 of 75 patients correctly diagnosed). Calculated a tiny P-value (P(X ≤ 63) ≈ 0.00004) and compared it with the 10% level halved to 0.05, leading to rejection of H0 (i.e., evidence that the probability is NOT 0.96).",
"description_cn": "将双尾检验逻辑应用于血液检测场景 (主张 p = 0.96,75 名患者中有 63 名被正确诊断)。计算出极小的 P 值 (P(X ≤ 63) ≈ 0.00004),并与 10% 减半后的显著性水平 (0.05) 比较,拒绝了 H0(即有证据表明概率不等于 0.96)。"
}
],
"vocabulary_en": "Test statistic, Null hypothesis (H0), Alternative hypothesis (H1), One-tailed test, Two-tailed test, Significance level, Successes, Expected value, Critical region, Overestimating, Improvement, Different.",
"vocabulary_cn": "检验统计量, 零假设 (H0), 备择假设 (H1), 单尾检验, 双尾检验, 显著性水平, 成功次数, 期望值, 临界区域, 高估, 改进\/提升, 不同。",
"concepts_en": "Binomial Distribution for Hypothesis Testing, P-value comparison method, Contextual conclusion writing, Adjustment for two-tailed tests (halving alpha).",
"concepts_cn": "用于假设检验的二项分布, P值比较法, 背景性结论撰写, 双尾检验的调整(显著性水平减半)。",
"skills_practiced_en": "Identifying key parameters (n, p) from word problems, Structuring hypothesis tests, Calculating binomial probabilities (using 'more extreme' logic), Interpreting results based on significance levels, Translating statistical findings into contextual language.",
"skills_practiced_cn": "从应用题中识别关键参数 (n, p), 构建假设检验的结构, 计算二项分布概率(使用‘更极端’的逻辑), 根据显著性水平解释结果, 将统计发现转化为上下文语言。",
"teaching_resources": [
{
"en": "Textbook examples demonstrating one-tailed and two-tailed binomial hypothesis tests.",
"cn": "教科书中展示单尾和双尾二项分布假设检验的例题。"
}
],
"participation_assessment": [
{
"en": "Very engaged, demonstrated strong retention of previous concepts, and actively participated in structuring the complex steps of the hypothesis test.",
"cn": "参与度很高,展示了对先前概念的牢固掌握,并积极参与了假设检验复杂步骤的构建。"
}
],
"comprehension_assessment": [
{
"en": "Excellent grasp of the overall process. Showed clear understanding of why the alternative hypothesis direction is chosen and correctly navigated the 'more extreme' concept.",
"cn": "对整体过程有很好的把握。清楚理解了选择备择假设方向的原因,并正确处理了‘更极端’的概念。"
}
],
"oral_assessment": [
{
"en": "Clear and confident oral responses, especially when explaining the differences between one-tailed and two-tailed tests.",
"cn": "口头回答清晰且自信,尤其在解释单尾检验和双尾检验的区别时表现出色。"
}
],
"written_assessment_en": "N\/A (Focus was on conceptual discussion and structured breakdown, not formal written submission of an exercise).",
"written_assessment_cn": "未进行(重点是概念讨论和结构分解,而非正式的书面练习提交)。",
"student_strengths": [
{
"en": "Strong ability to identify contextual cues in word problems to set up the correct alternative hypothesis (e.g., 'improvement' means >).",
"cn": "很强的能力,能够从应用题中识别上下文线索来设置正确的备择假设(例如,‘提升’意味着 >)。"
},
{
"en": "Quickly grasped the requirement to halve the significance level for two-tailed tests.",
"cn": "快速理解了双尾检验需要将显著性水平减半的要求。"
},
{
"en": "Solid recall of binomial distribution calculation methods required for finding the P-value.",
"cn": "对计算P值所需的二项分布计算方法记忆牢固。"
}
],
"improvement_areas": [
{
"en": "Slight confusion when deciding whether to include the observed value (e.g., 63) in the 'or more extreme' calculation for two-tailed tests, requiring instructor clarification.",
"cn": "在双尾检验中决定是否将观测值(例如 63)包含在‘或更极端’的计算中时略有混淆,需要教师澄清。"
},
{
"en": "The final step of putting the conclusion strictly into the context of the question sometimes requires prompting, although understanding is present.",
"cn": "将结论严格置于问题背景中的最后一步有时需要提示,尽管理解是存在的。"
}
],
"teaching_effectiveness": [
{
"en": "Highly effective. The step-by-step breakdown of the marking scheme provided a clear roadmap for success in exam questions.",
"cn": "非常有效。对评分标准的循序渐进的分解为考试问题的成功提供了一个清晰的路线图。"
},
{
"en": "Using multiple examples (one-tailed vs. two-tailed) clearly illustrated the minor but crucial differences in procedure.",
"cn": "使用多个示例(单尾与双尾)清晰地展示了程序中微小但关键的差异。"
}
],
"pace_management": [
{
"en": "The pace was appropriate, slowing down significantly to detail the nuances of the two-tailed test and the rationale behind the P-value comparison.",
"cn": "节奏适中,显著放慢速度来详细说明双尾检验和P值比较的原理。"
}
],
"classroom_atmosphere_en": "Collaborative and rigorous. The teacher created an environment where the student felt comfortable questioning the counter-intuitive statistical results.",
"classroom_atmosphere_cn": "协作且严谨。教师营造了一种让学生敢于质疑反直觉的统计结果的环境。",
"objective_achievement": [
{
"en": "All primary objectives regarding hypothesis identification, test execution, and contextual conclusion were met through guided practice.",
"cn": "通过指导练习,所有关于假设识别、检验执行和背景性结论的主要目标都已达成。"
}
],
"teaching_strengths": {
"identified_strengths": [
{
"en": "Excellent breakdown of the 4-mark hypothesis testing structure, demystifying the required components for full credit.",
"cn": "对4分假设检验结构的出色分解,解开了获得满分所需的组成部分的神秘面纱。"
},
{
"en": "Clear differentiation between Method 1 (P-value) and Method 2 (Critical Region), strongly recommending the more efficient method.",
"cn": "清晰区分了方法1(P值)和方法2(临界区域),强烈推荐更有效的方法。"
}
],
"effective_methods": [
{
"en": "Using the expected value (mean) to determine the direction of 'more extreme' when setting up the binomial probability calculation.",
"cn": "利用期望值(均值)来确定设置二项分布概率计算时‘更极端’的方向。"
},
{
"en": "Explicitly showing how the significance level is halved for two-tailed tests and why.",
"cn": "明确展示双尾检验的显著性水平如何减半以及原因。"
}
],
"positive_feedback": [
{
"en": "The student demonstrates a very strong foundation for this advanced topic, absorbing the logic quickly.",
"cn": "学生对这个高级主题表现出非常坚实的基础,能迅速吸收其逻辑。"
}
]
},
"specific_suggestions": [
{
"icon": "fas fa-lightbulb",
"category_en": "Conclusion Structure",
"category_cn": "结论结构",
"suggestions": [
{
"en": "For the final mark, practice always starting the conclusion with the statistical finding (e.g., 'Since P < 0.05, there is sufficient evidence to reject H0') before translating it into context.",
"cn": "为了获得最后的分数,练习总是以统计发现(例如‘由于 P < 0.05,有充分的证据拒绝 H0’)开头,然后再将其转换为上下文。"
}
]
},
{
"icon": "fas fa-chart-line",
"category_en": "Two-Tailed Test Practice",
"category_cn": "双尾检验练习",
"suggestions": [
{
"en": "Review the rule for two-tailed tests: H1 contains '≠', and the significance level ($\\alpha$) must be divided by 2 before comparing it to the calculated P-value.",
"cn": "复习双尾检验的规则:H1 包含‘≠’,并且在与计算出的 P 值比较之前,显著性水平($\\alpha$)必须除以 2。"
}
]
}
],
"next_focus": [
{
"en": "Hypothesis Testing using Critical Regions (Method 2) to ensure readiness for exam questions that specifically demand this approach.",
"cn": "使用临界区域(方法2)的假设检验,以确保能够应对明确要求此方法的考试题目。"
}
],
"homework_resources": [
{
"en": "Complete exercises 6-10 from the textbook section, focusing only on the P-value method (Method 1) unless specified otherwise.",
"cn": "完成教科书中第6至10题的练习,除非另有说明,否则只关注 P 值法(方法1)。"
}
]
}