Development of a Two-Tier Multiple Choice Evaluation Tool Using Wondershare Quiz Creator to Identify Mathematical Connection

The objective of this research is to develop a two-tier multiple choice evaluation tool using Wondershare Quiz Creator to identify students' mathematical connection abilities. The research uses the ADDIE model. The test subjects were class VII-D of Islamic Qon Middle School. The research instruments used were interview sheets, validation sheets, test instruments

everyday life. Therefore, the ability to make mathematical connections needs to be trained from an early age. If students are able to relate mathematical ideas to one another, their understanding will last longer because they can see the relationships between mathematical topics and contexts outside mathematics in everyday life. In its Standards, the National Council of Teachers of Mathematics (NCTM, 2000) defines mathematical connection as the relationship between mathematical topics, the relationship between mathematics and other disciplines, and the relationship between mathematics and the real world or everyday life.
One type of question that can be used to identify mathematical connection abilities is a multiple choice test equipped with reasons, known as a diagnostic test (Susanti, 2014). Two Tier Multiple Choice (TTMC) is a diagnostic test in the form of multiple choice questions consisting of two levels, first developed by David F. Treagust in 1988. The first level contains the questions to be tested on students, while the second level contains the reasons students give for choosing the first answer. Two Tier Multiple Choice is a development of the usual multiple choice test; with a two-level test format, educators are expected to be able to measure and understand the mathematical skills being tested.
The teaching process carried out by teachers in the classroom is still conventional. Many teachers are still not familiar with the information technology that teachers in urban areas commonly use to support the learning process. They face many obstacles to upgrading their knowledge, which consequently becomes outdated (Fauziyah & Uchtiawati, 2017). Therefore, to make students enthusiastic, evaluation tools must be designed to be as attractive as possible so that they are effective, interactive, and efficient. This study uses Wondershare Quiz Creator, software designed to create IT-based assessments or quizzes (As'ari, 2017). It is user friendly: it is easy to use and does not require a complicated programming language. The questions that have been made can be stored as standalone Flash files or hosted on a website (Utomo, 2015). In addition, the published questions can be saved as swf, html, and exe files (Astika, 2017; Farman et al., 2021). Wondershare Quiz Creator makes it easier for educators to prepare learning evaluations because it does not require a complicated programming language and offers many question types, such as true or false, multiple choice, fill in the blank, matching, and short essay (Farman, 2020).
Wondershare Quiz Creator has several advantages: (1) questions can be designed faster, because users are not required to master ActionScript; (2) many different question types are available; (3) the background, color, font, and other elements can be changed; (4) questions can be published online; (5) questions can be presented in random order; (6) student answers can be checked quickly; (7) student results are automatically sent to the educator's email; (8) users can set the KKM (Minimum Completeness Criteria) for students. Wondershare Quiz Creator also has weaknesses: (1) templates are limited; (2) text cannot be animated with motion; (3) hardware or software damage may occur; (4) the software is not suitable for computers or laptops with low specifications.
There are several studies on two tier multiple choice, one of which was conducted by Shidiq (2014). That study used instruments in conventional form, while this study uses Wondershare Quiz Creator software. The research subjects also differ: this study was carried out at the SMP/MTs level, while Ari Syahidul Shidiq's research was conducted at the SMA/MA level. There is also research on Wondershare Quiz Creator, one of which was conducted by Munscfatra (2017); in that study, the percentage of student responses increased by 21%, from the initial "interesting" criterion to "very interesting". Based on the description above, the researcher is interested in conducting research on "Development of an Evaluation Tool Two Tier Multiple Choice Using Wondershare Quiz Creator Software to Identify Mathematical Connection".

B. Methodology

Type of Research
The type of research used is research and development. This study uses the ADDIE model, with the stages of Analysis, Design, Development, Implementation, and Evaluation.

Research Subjects
The subjects in this study were students of class VII D of SMP Islamic Qon, who took part in the development trial at the development stage. At the implementation stage, the researchers took class VII C as the subject. Class VII was chosen because the material used in the research is taught in class VII, and on the recommendation of the mathematics teacher.

JME/6.2; 133-148; December 2021

Research Design
At the analysis stage, the researchers carry out performance analysis, student analysis, analysis of the facts, concepts, and procedures of the learning materials, and analysis of learning objectives. The design stage follows, which includes preparing the tests, designing the media, and producing draft 1. At the development stage, draft 1 is validated by media and material experts. The draft is said to be valid if the assessment shows an evaluation percentage > 60%, which is classified as feasible or very feasible. If it is not valid, it is revised to produce draft 2. If it is valid, a development trial is carried out to find out whether it is effective, as seen from the quality of the questions and the student responses. If it is still not effective, another trial is carried out until the effectiveness criteria are met. Once a valid and effective evaluation tool has been obtained, the next step is the implementation stage: a mathematical connection test based on Two Tier Multiple Choice is given, followed by a student response questionnaire and data analysis. The result is a valid and effective evaluation tool.

Research Procedure
This development was carried out on social arithmetic material and follows the ADDIE development model. The steps are: (1) analysis, which examines the work situation and environment to find out what products need to be developed; (2) design, the activity of designing the product as required; (3) development, the activity of making and testing the product; (4) implementation, the activity of using the product; (5) evaluation, the process of assessing whether the product that has been made meets the specifications (Sugiyono, 2013).

Data Collection Method
Data were collected using the following methods: (1) interviews were conducted as a preliminary study with two kinds of informants, the mathematics teacher and students; (2) validation questionnaires were given to material experts and media experts, so that the evaluation tool to be tested measures what it is intended to measure and displays what it is intended to display. The material validation was filled out by mathematics education lecturers at the University of Muhammadiyah Gresik and by mathematics teachers, while the media validation was filled out by experts in that field; in this study the researchers chose two informatics engineering lecturers. The data were collected by showing the Wondershare Quiz Creator application together with a validation sheet to the validator for assessment; (3) the two tier multiple choice evaluation test was developed to identify mathematical connections. The application was shared in swf format through the WhatsApp group, and students were asked to install it and work on the evaluation test questions. After students answered all the questions, the results were automatically recorded and sent to the researcher's email; (4) a student response questionnaire on the developed evaluation tool was distributed as a Google Form via a bit.ly link.

Instruments
Several instruments were used to obtain data in this research: (1) interview sheets. The questions asked of the subject teacher concern the habits of educators in evaluating learning, the form of the evaluation questions given, the applications often used when evaluation is carried out online, and the character or attitude of the students during the evaluation. The questions given to students cover their interest in the developed evaluation tool and how they would work on it in two tier multiple choice form; (2) validation sheets, consisting of two types, material validation and media validation. Material validation is filled out by 2 validators, a mathematics subject teacher and a lecturer. The material expert validation sheet contains 20 statements covering 4 assessment indicators: presentation, content quality, construction, and ease of use. The media validation sheet also consists of 20 statements, covering the indicators of presentation, content design, display design, and ease of use; (3) the test instrument, in the form of two-level multiple choice. The questions relate to social arithmetic material for class VII students and are adjusted to the indicators of students' mathematical connection abilities; (4) questionnaire sheets, given to respondents in the form of questions or statements that must be filled out. Students fill out the questionnaire according to their opinion of the questions they worked on, the presentation of the questions, and the appearance of the evaluation tool.

Technique of Data Analysis
Data analysis was both quantitative and qualitative. Qualitative data come from the student interviews. Quantitative data analysis is based on the percentage results of the development of the two-tier multiple choice evaluation tool using Wondershare Quiz Creator software and on item analysis, processed using statistical tests. In this study, data analysis consists of several stages:

a. Expert Validation Analysis
The expert validation results are used as a guide in revising each component of the two tier multiple choice evaluation tool. The analysis of the media and material expert validation instruments has the following steps:
1) Give a score on each validation sheet using the criteria in Table 1.
Table 1. Eligibility Criteria of the Evaluation Tool for Validators (lowest category: Very Less (SK)) (Sugiyono, 2013)
2) Analyze the results of the questionnaire using the formula (Sudijono, 2016):
$$P = \frac{f}{N} \times 100\%$$
Description:
$P$ = percentage
$f$ = raw score earned
$N$ = maximum score
3) The last step is to conclude the results of the analysis by looking at the table below:

b. Mathematical Connection Ability
According to Jauhariansyah (2014), the scoring criteria for the two-level diagnostic test are as follows:
1) If the answer and the reason are both correct, the score = 1
2) If the answer is correct and the reason is wrong, the score = 0
3) If the answer is wrong and the reason is correct, the score = 0
4) If the answer and the reason are both wrong, the score = 0
After scoring, the next step is to calculate the percentage score of each student, and then the percentage of students' mathematical connection abilities for each item. The percentage of students' mathematical connections for each item is classified into categories, the lowest of which is 0 to 20.99 (Very Low). After the percentage of students' mathematical connections is known, the average percentage for each item is calculated.
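The scoring rule above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the per-student percentage (number of items scored 1 divided by the number of items, times 100) is an assumption made for illustration, since the formula itself is not legible in this copy.

```python
from typing import Sequence, Tuple


def score_two_tier(answer_correct: bool, reason_correct: bool) -> int:
    """Score one two-tier item: 1 only if both tiers are correct, else 0."""
    return 1 if (answer_correct and reason_correct) else 0


def student_percentage(responses: Sequence[Tuple[bool, bool]]) -> float:
    """Percentage of items on which a student scored 1.

    Assumed formula (not taken from the source):
    100 * total score / number of items.
    """
    if not responses:
        raise ValueError("responses must contain at least one item")
    total = sum(score_two_tier(a, r) for a, r in responses)
    return 100.0 * total / len(responses)


# Four items covering the four scoring cases: only the first earns a point.
example = [(True, True), (True, False), (False, True), (False, False)]
print(student_percentage(example))  # 25.0
```

Keeping the item rule separate from the aggregation makes it easy to swap in a different percentage formula if the original one differs from the assumption.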

c. Student Responses
To determine student responses to the evaluation tool, students are given a student response questionnaire scored according to the table below:
Table 4. Student Response Score
Score   Answer Choices
5       Strongly Agree
4       Agree
3       Don't Agree
2       Disagree
1       Strongly Disagree
Next, the percentage for each item is calculated using the formula (Sudijono, 2016):
$$P = \frac{f}{N} \times 100\%$$
Description:
$P$ = percentage
$f$ = raw score earned
$N$ = maximum score
After the percentage of student responses to the use of the learning evaluation tool is obtained, the responses are grouped into several criteria as follows:
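As a small illustration of the percentage calculation for questionnaire data, the sketch below divides the raw score earned by the maximum possible score. The function name and the sample scores are hypothetical; the maximum score is assumed to be 5 points per response, following the five-point scale in Table 4.

```python
from typing import Sequence


def response_percentage(scores: Sequence[int], max_per_response: int = 5) -> float:
    """Return raw score earned / maximum score * 100 for one questionnaire item.

    `scores` holds one Likert score (1..max_per_response) per respondent.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if any(s < 1 or s > max_per_response for s in scores):
        raise ValueError("each score must lie between 1 and max_per_response")
    raw = sum(scores)                          # raw score earned
    maximum = max_per_response * len(scores)   # maximum score
    return 100.0 * raw / maximum


# Hypothetical responses from four students to one statement.
print(response_percentage([5, 4, 4, 3]))  # 80.0
```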

d. Item Analysis
Several kinds of item analysis are carried out: validity testing, reliability testing, difficulty level testing, discriminating power testing, and distractor effectiveness, as follows:

1) Validity Test
Item validity is calculated using the biserial correlation coefficient formula (Arikunto, 2013):
$$\gamma_{pbi} = \frac{M_p - M_t}{S_t}\sqrt{\frac{p}{q}}$$
Description:
$\gamma_{pbi}$ = biserial correlation coefficient
$M_p$ = mean score of subjects who answered correctly for the item whose validity is sought
$M_t$ = mean total score
$S_t$ = standard deviation of the total score
$p$ = proportion of students who answered correctly
$q$ = proportion of students who answered incorrectly
If $\gamma_{pbi} > r_{table}$ at the chosen significance level, the instrument is valid. The validity test is calculated using SPSS 16.
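For readers without SPSS, the validity coefficient can be sketched in Python. The sketch assumes the usual form of this coefficient, (M_p - M_t) / S_t * sqrt(p / q), often called the point-biserial correlation, and assumes the population standard deviation for S_t; the function name and sample data are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def point_biserial(item: Sequence[int], totals: Sequence[float]) -> float:
    """Correlation between one dichotomous item (0/1) and the total scores.

    item[i] is 1 if student i answered the item correctly, else 0.
    totals[i] is student i's total test score.
    """
    if len(item) != len(totals) or not item:
        raise ValueError("item and totals must be non-empty and equally long")
    p = sum(item) / len(item)   # proportion who answered correctly
    q = 1.0 - p                 # proportion who answered incorrectly
    s_t = pstdev(totals)        # assumption: population standard deviation
    if p == 0.0 or p == 1.0 or s_t == 0.0:
        raise ValueError("coefficient is undefined without variation")
    m_p = mean(t for i, t in zip(item, totals) if i == 1)  # mean of correct group
    m_t = mean(totals)                                     # mean of all students
    return (m_p - m_t) / s_t * math.sqrt(p / q)


# Hypothetical data: the two highest scorers answered the item correctly.
r = point_biserial([1, 1, 0, 0], [4, 3, 2, 1])
print(round(r, 4))
```

With the population standard deviation, this value coincides with the Pearson correlation between the item and the total score; the result is then compared with the tabulated r value.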

2) Reliability Test
In this study, the reliability test uses the K-R 20 formula:
$$r_{11} = \left(\frac{n}{n-1}\right)\left(\frac{S^2 - \sum pq}{S^2}\right)$$
Description:
$r_{11}$ = overall test reliability
$p$ = proportion of subjects who answered each question correctly
$q$ = proportion of subjects who answered each question incorrectly
$\sum pq$ = sum of the products of $p$ and $q$
$n$ = number of items
$S$ = standard deviation of the test
If $r_{11} > r_{table}$, where $r_{table}$ is the tabulated correlation coefficient, the instrument is reliable. It can be calculated using SPSS 16. In addition, an individual item is considered reliable if its Cronbach's alpha if item deleted value is less than the overall Cronbach's alpha.
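A minimal Python sketch of the K-R 20 computation is given below as an illustration. It assumes the standard K-R 20 form, (n / (n - 1)) * ((S^2 - sum of p*q) / S^2), with the population variance of the total scores; the function name and the small score matrix are hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def kr20(matrix: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson 20 reliability for a 0/1 score matrix.

    Rows are students, columns are items; each entry is 1 (correct) or 0.
    """
    n_students = len(matrix)
    n_items = len(matrix[0]) if n_students else 0
    if n_students < 2 or n_items < 2:
        raise ValueError("need at least two students and two items")
    totals = [sum(row) for row in matrix]
    var_total = pvariance(totals)  # assumption: population variance S^2
    if var_total == 0:
        raise ValueError("total scores have no variance")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in matrix) / n_students  # proportion correct
        sum_pq += p * (1.0 - p)                         # p * q for item j
    return (n_items / (n_items - 1)) * ((var_total - sum_pq) / var_total)


# Hypothetical answers of four students on three items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(scores))
```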

3) Level of Difficulty
The number that shows how difficult or easy an item is is called the index of difficulty. The level of difficulty of each item is calculated using the formula (Arikunto, 2013):
$$P = \frac{B}{JS}$$
Description:
$P$ = difficulty index
$B$ = number of students who answered correctly
$JS$ = the number of all students
After the results are obtained, the next step is to categorize them using the index of difficulty criteria shown in Table 6.
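The difficulty index is the share of students who answered an item correctly, which can be sketched as follows. The medium band of 0.31 to 0.70 is the one reported in this study; treating values below it as difficult and values above it as easy is an assumption, as are the function names.

```python
def difficulty_index(correct: int, students: int) -> float:
    """Difficulty index: number of correct answers / number of students."""
    if students <= 0 or not 0 <= correct <= students:
        raise ValueError("need 0 <= correct <= students and students > 0")
    return correct / students


def difficulty_category(index: float) -> str:
    """Classify an index; only the medium band (0.31 to 0.70) is from the source."""
    if index < 0.31:
        return "difficult"  # assumed lower band
    if index <= 0.70:
        return "medium"
    return "easy"           # assumed upper band


# Hypothetical item answered correctly by 14 of 28 students.
p = difficulty_index(14, 28)
print(p, difficulty_category(p))  # 0.5 medium
```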

4) Distinguishing Power Test
Distinguishing power is used to improve the quality of each item. The distinguishing power of each item is measured using the formula (Arikunto, 2013):
$$D = \frac{B_A}{J_A} - \frac{B_B}{J_B}$$
Description:
$D$ = distinguishing power
$B_A$ = the number of participants in the upper group who answered correctly
$B_B$ = the number of participants in the lower group who answered correctly
$J_A$ = the number of participants in the upper group
$J_B$ = the number of participants in the lower group
The distinguishing power obtained from this formula is interpreted using the criteria in Table 7 (Hamzah, 2014).
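The distinguishing power compares the proportion of correct answers in an upper and a lower group. The sketch below is illustrative only: it assumes the groups are the top and bottom halves of students ranked by total score (the source does not state how the groups were formed), and the function name is hypothetical.

```python
from typing import Sequence


def distinguishing_power(item: Sequence[int], totals: Sequence[float]) -> float:
    """Proportion correct in the upper group minus that in the lower group.

    item[i] is 1 if student i answered correctly, else 0; totals[i] is the
    student's total score. Assumption: upper/lower groups are the top and
    bottom halves by total score (the middle student is dropped if odd).
    """
    if len(item) != len(totals) or len(item) < 2:
        raise ValueError("need at least two students with matching data")
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    half = len(order) // 2
    upper, lower = order[:half], order[-half:]
    correct_upper = sum(item[i] for i in upper)  # correct answers, upper group
    correct_lower = sum(item[i] for i in lower)  # correct answers, lower group
    return correct_upper / len(upper) - correct_lower / len(lower)


# Hypothetical item answered correctly only by the three highest scorers.
print(distinguishing_power([1, 1, 1, 0, 0, 0], [6, 5, 4, 3, 2, 1]))  # 1.0
```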

5) Effectiveness of Distractors
A distractor is an answer option that is not the answer key. The distractor index can be calculated using the formula given by Arifin (2013). If all students answer an item correctly, the index is 0, which means the distractor does not work. A distractor is said to be good if it is chosen by at least 5% of all students.
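The 5% rule for distractors can be checked with a short Python sketch. This illustrates only the threshold rule stated above, not the distractor index formula of Arifin (2013); the function name, option labels, and sample answers are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def weak_distractors(
    choices: Sequence[str],
    options: Sequence[str],
    key: str,
    threshold: float = 0.05,
) -> List[str]:
    """Return the distractors chosen by fewer than `threshold` of all students.

    choices holds the option each student selected; key is the correct option.
    """
    if not choices:
        raise ValueError("choices must not be empty")
    counts = Counter(choices)
    n = len(choices)
    return [
        opt for opt in options
        if opt != key and counts[opt] / n < threshold
    ]


# Hypothetical item with key "A", answered by 30 students.
answers = ["A"] * 25 + ["B"] * 2 + ["C"] * 2 + ["D"] * 1
print(weak_distractors(answers, ["A", "B", "C", "D"], key="A"))  # ['D']
```

In this example option D is chosen by 1 of 30 students (about 3.3%), so it falls below the 5% threshold, while B and C (about 6.7% each) function properly.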

Success Criteria for the Development of the TTMC-Based Evaluation Tool
The evaluation tool is said to be feasible if it meets 2 criteria, namely valid and effective. The indicators of validity and effectiveness are listed in Table 9; one of them is that each question is based on the item indicator. The two tier multiple choice evaluation tool using Wondershare Quiz Creator is said to be good judging from the analysis of each item. The item analysis indicators are listed in Table 10; for the effectiveness of distractors, the indicator is that each distractor is selected by at least 5% of all students.

C. Findings and Discussion
This development research refers to the ADDIE model, which was developed by Dick and Carey in 1996. The stages are Analysis, Design, Development, Implementation, and Evaluation, as follows:

1. Analyze
This stage begins with a performance analysis, conducted by interviewing the subject teacher about the problems faced in implementing online learning. The interviews were conducted face-to-face with the mathematics teacher at the Qon Islamic Middle School. From the interviews and school observations, a basic problem was found: an evaluation tool was needed that could identify students' mathematical connection abilities and could be used in online learning.
Next, a student analysis was carried out to find a suitable evaluation tool design. Information was obtained through online interviews with 4 randomly selected students via Google Meet video conference. The interviews showed that the students had never been given a mathematics learning evaluation based on two tier multiple choice. Students want an evaluation tool that is interesting, interactive, and not boring, equipped with pictures and audio. Then an analysis of facts, concepts, principles, and procedures was carried out to obtain relevant material for developing the evaluation tool. The research was conducted in class VII in the even semester on social arithmetic material. In addition, the researcher compiled the concept of the material used in the test, namely social arithmetic, and the concept of the application features in the evaluation tool.
The last step in the analysis stage is the analysis of learning objectives. Based on the results of the analysis and discussion with the mathematics teacher, the purpose of the two-tier multiple choice-based evaluation using Wondershare Quiz Creator software is to identify students' mathematical connection abilities. The mathematical connection indicators used as a reference in preparing the test items are: (1) connections among topics within mathematics; (2) connections with other disciplines (connections among topics across lessons); (3) connections with the real world or knowledge of everyday life.

2. Design
At this stage, the first step is the preparation of the tests, which are based on the indicators of connection ability, KI, and KD. The test consists of 20 multiple choice questions, 10 at the first level and 10 at the second level, with a duration of 90 minutes. The media design step then designs the features of the evaluation tool to be developed, which consist of a login form, instructions for use, and the final results display of the evaluation tool. The following is a display of the evaluation tool:

Picture 1. Login Form on the Evaluation Tool
In the login form, students must fill in several fields, such as full name, attendance number, and class. The login form has a start button to begin the quiz, and 3 buttons at the top right for author information, sound, and print.

Picture 2. Instructions for Use of Evaluation Tools
Picture 2 shows information such as instructions for working on two-tier multiple choice questions. The total number of questions given is 20 across two levels, so a total score of 100 is obtained if all the questions are answered correctly. In addition, there is a passing rate and a passing score: the minimum completeness limit is 70, and the working time is 1 hour 30 minutes.

Picture 3. Final Display of the Evaluation Tool
Picture 3 shows information such as the number of questions, the score obtained, the passing rate, the passing score, and the length of time students spent working. In addition, there is a review button for seeing which questions were answered correctly and incorrectly. The last step is the initial design, which produces the initial draft of the learning evaluation tool. In making the two-tier multiple choice-based evaluation tool using Wondershare Quiz Creator, the first step is to write the evaluation test questions in Microsoft Word 2013. After the questions and the required pictures have been created, the researchers enter them into Wondershare Quiz Creator.

Picture 4. Making an Evaluation Tool Using WQC
The researcher did not create the background design, because the background was already provided in the application. The display of the questions at the first level can be seen in Picture 5:

Picture 5. Display of the First Level Questions for the TTMC Evaluation Tool
After choosing an answer at the first level, students can click the next button to move to the next question. The display of the second level questions can be seen in Picture 6:

Picture 6. Display of the Second Level Questions for the TTMC Evaluation Tool
This evaluation tool cannot be opened without a supporting web browser, because the file is stored in swf format. To run the evaluation tool, students must first download the Puffin Web Browser application from the Play Store, as shown in Picture 7:

Picture 7. Icon of the Supporting Web Browser

3. Development
At this stage, expert validation is carried out on the evaluation tool developed. Expert validation involves media experts and material experts. The names of the validators can be seen in Table 11:

Table 11. Name of the expert validator

The expert validation results fall in the Very Eligible category for both media and material, so the two-tier multiple choice-based mathematics learning evaluation tool using Wondershare Quiz Creator meets the valid criteria. This is because the validation results of the media experts and material experts exceed 60%, namely 94% and 80%. Furthermore, a development trial was carried out to determine the quality of the items in the evaluation tool developed. The answers chosen by students at the two levels are linked, so the researcher determines the number of correct answers to each question from both levels. If students answer correctly at the first level and at the second level, they get a score of 1. However, if the first level is correct and the second level is wrong, or vice versa, the score is 0. If students answer incorrectly at both the first level and the second level, the score is 0. This can be seen in Picture 8:

Picture 8. Students' Answers to each item on the evaluation test
After the students' answers at the first and second levels were known, the researcher gave a questionnaire on the students' responses to the evaluation tool developed. The student response results fall in the appropriate criteria: the Ease aspect scored 84% (Very Attractive), and the average percentage was 82% (Interesting).
In addition, the researcher analyzed the items in the two-tier multiple choice evaluation test in terms of validity, reliability, level of difficulty, discriminatory power, and effectiveness of distractors. The results on the quality of the items are:
a. Item Validity
In the validity test, each question is said to be valid if its validity value meets the validity criterion. Based on the results of the validity test using IBM SPSS Statistics 22, 1 item, number 10, was found to be invalid.
b. Reliability
The quality of an item is said to be good if it has a high degree of reliability. Based on the reliability test, the questions fall in the reliable criteria. Reliability can also be examined for each item: an item is reliable if its Cronbach's alpha if item deleted value < the Cronbach's alpha value. Based on the reliability test on each item, 1 question, item number 2, was found to be unreliable. As a result, only 8 questions were used at the implementation stage.
[Table: number of students selecting each answer alternative (Alternatif Jawaban) per item]
c. Difficulty Level
In testing the level of difficulty, a good quality item has a difficulty level of 0.31 to 0.70, the medium category. Based on the analysis of the difficulty level, 70% of the items were moderate, 30% were easy, and there were no difficult questions.
d. Distinguishing Power
Items must have a distinguishing power of at least the sufficient criteria. Based on the results of the discriminatory test, 100% of the test items met this minimum, and no questions fell below it.
e. Distractor Effectiveness
A distractor functions properly if it is selected by at least 5% of all students. Based on the distractor effectiveness analysis, 11 distractors at the first level and 6 distractors at the second level do not function properly. By criterion, 35% of the distractors are very good, 30% good, 26.7% quite good, 5% bad, and 3.3% very bad.

4. Implementation
At this stage the researchers implemented the evaluation tool in class VII C, with 28 learners. To classify the mathematical connection ability on each item, the researcher grouped the answers to each item into three criteria. BB means that students answered correctly at both the first and second levels; BS/SB means that students answered correctly at the first level and incorrectly at the second level, or vice versa; SS means that students answered incorrectly at both levels. For more details, see the corresponding table. There are 224 answers spread across the answer criteria (BB, BS/SB, SS). In the BB criteria there are 161 of the 224, an average percentage of 71.88%. In the BS/SB criteria there are 35 of the 224, an average percentage of 15.63%. In the SS criteria there are 28 of the 224, an average of 12.50%. In addition, the results show that mathematical connection ability is very high on questions number 1 and 2, high on questions number 3, 4, 5, and 8, and sufficient on questions number 6 and 7.
When the mathematical connection ability of each student is identified, 11 students have very high mathematical connection abilities, 13 students are in the high category, 1 student is in the sufficient category, 1 student is in the low category, and 2 students are in the very low category.
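The percentages reported for the implementation stage can be rechecked with a few lines of Python. The counts are taken from the text (28 students, 8 items, hence 224 answers); the function name is hypothetical, and rounding half up is an assumption about how the reported figures were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def share_percent(count: int, total: int, places: str = "0.01") -> Decimal:
    """Return count / total as a percentage, rounded half up to `places`."""
    if total <= 0:
        raise ValueError("total must be positive")
    value = Decimal(100 * count) / Decimal(total)
    return value.quantize(Decimal(places), rounding=ROUND_HALF_UP)


students, items = 28, 8
answers = students * items  # 224 answers in total

# Answer criteria across all items (counts from the text).
for label, count in {"BB": 161, "BS/SB": 35, "SS": 28}.items():
    print(label, share_percent(count, answers))

# Students per ability category (counts from the text), to one decimal place.
categories = {"very high": 11, "high": 13, "sufficient": 1, "low": 1, "very low": 2}
for label, count in categories.items():
    print(label, share_percent(count, students, "0.1"))
```

Decimal arithmetic is used instead of the built-in `round` because binary floating point with round-half-even would turn 15.625 into 15.62 rather than the reported 15.63.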

5. Evaluation
At this stage, the evaluation tool that was developed is reviewed. The results of the evaluation are used to provide feedback on the product being developed. The research was carried out in accordance with the ADDIE development model procedure, which begins with the analysis stage and ends with the evaluation stage. In practice, the researchers experienced several obstacles. For example, some students had difficulty downloading the Puffin Web Browser application because of poor signal or insufficient storage space. However, this did not reduce the students' enthusiasm for working to get the best results.

D. Conclusion
From the research on the development of a mathematics learning evaluation tool based on two tier multiple choice, it can be concluded that: 1) The mathematics learning evaluation tool based on two tier multiple choice using Wondershare Quiz Creator was developed with the ADDIE stages, namely analysis (situation and environment analysis activities), design (product design as needed), development (product manufacturing and testing activities), implementation (product use activities), and evaluation (product assessment process). The evaluation tool is said to be feasible because it meets 2 criteria, namely valid and effective. The valid criterion is based on the assessments of the media experts, 94% (very feasible), and the material experts, 80% (adequate). The effective criterion can be seen from the student responses and the quality of the items. The student responses yielded a percentage of 82% (very interesting). As for the quality of the items, the validity test found 1 question that is not valid. The reliability test found 1 question that is not reliable, with a reliability value of 0.790. In the difficulty level test, 7 questions are in the medium category, 3 questions are in the easy category, and none are in the difficult category. In the discriminatory power test, 2 questions meet the very good criteria, 6 the good criteria, and 2 the sufficient criteria, and none fall in the bad category. As for the effectiveness of distractors, 11 distractors at the first level and 6 distractors at the second level were chosen by fewer than 5% of all students.
2) The students' mathematical connection abilities identified through the two-tier multiple choice-based evaluation tool using Wondershare Quiz Creator are as follows: 11 students (39.3%) have very high mathematical connection abilities, 13 students (46.4%) have high abilities, 1 student (3.6%) has sufficient ability, 1 student (3.6%) is in the low criterion, and 2 students (7.1%) are in the very low criterion.