A Framework for Lesson Planning

Using learning intentions and success criteria can help teachers ensure that their activities align with what they want students to know.

Two teachers collaborating on a lesson in the school library

As an instructional coach, I collaborate with nearly 65 teachers at an urban high school. My goal is to support teachers of many subjects in embedding literacy in their lessons without disrupting their classroom objectives.

I often work with our novice teachers and student teachers by reviewing their lesson plans and recommending literacy skills that reinforce their learning intentions and success criteria, which are defined by Douglas Fisher and Nancy Frey as “what you want students to know and be able to do by the end of one or more lessons.” Without learning intentions and success criteria, they write, “lessons wander and students become confused and frustrated.”

When I ask new teachers to tell me the purpose of their lessons, they often describe the activities they’ve created. For example, recently I was collaborating with a student teacher who was eager to teach the Bill of Rights to her freshmen. She began our conversation by explaining that she was going to read real-life scenarios with differing perspectives and ask students to move to the front of the classroom if they agreed with a particular scenario or to the back of the classroom if they disagreed. Afterward, she would ask students to explain their decisions.

Her excitement was palpable. She showed me the scenarios she had written, the “Agree” and “Disagree” signs she had created, and the worksheet she had designed so that students could brainstorm their own Bill of Rights.

When she finished, I commended her on the work she had done. Clearly, she had thought about the activity in detail. Next I asked her about the point of the lesson—what she wanted students to get out of it.

What she wanted—for her students to know what the Bill of Rights is, where to find it, why it’s important, and why we still need it today—was not actually conveyed by the activity. She hadn’t written a learning intention and the accompanying success criteria yet because she had been so excited to refine her activity. Without them, however, all she had was an activity—one that was not aligned with her goals for the day.

Learning Intentions and Success Criteria

Crafting a quality learning intention takes planning. Often, teachers will use an activity as their learning intention—but a learning intention goes beyond an activity. It focuses on the goal of the learning—the thing we want our students to know and do. The learning intention helps students stay focused and involved.

It’s important to create the learning intention first, and then determine the success criteria that students can use to assess their understanding—and then create the activity and some open-ended questions that help students learn.

When I was working with the teacher on her Bill of Rights lesson, we took a step back to develop the learning intention and its success criteria. The learning intention was this: “I can explain the Bill of Rights, its purpose, and its relevance to my life.” The success criteria were built around students’ ability to annotate and paraphrase the Bill of Rights, and to explain its importance, both in general and in their own lives. Annotating, paraphrasing, and analyzing are skills that are based on ACT College and Career Readiness Standards and Common Core State Standards, and they could be seamlessly incorporated into the lesson with minimal effort.

Learning intentions and success criteria are valuable across all subjects. In algebra, for example, a learning intention might be “I can understand the structure of a coordinate grid and relate the procedure of plotting points in quadrants to the structure of a coordinate grid.” The success criteria for this intention could be that students can talk and write about that procedure, using the correct vocabulary; that they can plot and label points in each quadrant on a coordinate grid; and that they can create a rule about coordinates for each quadrant.

In environmental science, if the learning intention is “I can recognize the history, interactions, and trends of climate change,” the success criteria could be that students are able to locate credible research about the history of climate change and share their research with their peers, that they can demonstrate the interactions of climate change and explain the value of those interactions, and that they can show the trends of climate change utilizing a graph and explain the value of the trends.

A Way to Focus Lesson Planning

Engaging students in their learning is certainly necessary, but the student teacher I was working with became acutely aware of the value of the skills she was attempting to help students develop and why those skills—not the activity—should drive instruction.

During her next class, she posted the learning intention and success criteria where students could readily see them. Next, she asked her students to paraphrase the success criteria, making sure they understood what they were about to do. She referred to the learning intention and success criteria several times throughout the lesson so that students could determine their own level of understanding and, if necessary, decide which skills they understood and which ones still needed support. She followed up with an exit ticket, asking students what they had learned in the lesson, how they learned it, and why learning it was important.

No matter what subject you teach, as you plan your instruction, ask yourself these questions:

  • What do you want your students to know? Why is that important?
  • Can they learn this information another way? How?

Only once you’ve thought through your answers should you begin writing your learning intention and success criteria. Keep the activities you’ve created—but don’t make them the center or the goal of the lesson. Spend your time designing a learning intention and success criteria that will support your students’ learning and build skills they can apply to all facets of their academic lives.

The English Classroom

A GUIDE FOR PRESERVICE AND GRADUATE TEACHERS

Learning Intentions

The Situation

Students need to understand the overall purpose of learning.

The Solution

Students always need to understand why the learning is important.

Learning Intention

A learning intention is a statement that summarises what a student should know, understand or be able to do by the end of a lesson or series of lessons. The purpose of a learning intention is to ensure that students understand the direction and purpose of the lesson. These statements are presented at the start of a lesson (something we call ‘visible learning’) and should be discussed with students throughout the lesson, or whenever necessary. They are also used to summarise the learning: we return to the learning intentions at the end to evaluate whether students understand what they explored.


As mentioned, learning intentions are written with the stems know, understand or be able to. For example:

  • Know the definition of a metaphor.
  • Understand how metaphors are used to create imagery in a text.
  • Be able to identify three metaphors in a short poem.

As you can see, the stems are associated with levels of thinking. When sequencing your lessons, consider whether the skills students are learning are lower-order, such as knowing a definition, or higher-order, such as applying this knowledge to a text.

How to Present a Learning Intention

The learning intentions should be presented on the whiteboard and/or a PowerPoint slide, depending on the context of your classroom. Unpack the language in the statement, and if a word is unfamiliar to students, explain it. Take ‘know the definition of a metaphor’: this might be the first time students have come across the word metaphor, and that is okay. Reinforce that you will be looking at this new vocabulary at the beginning of the lesson and that they can practice understanding the term later.

The language needs to be clear and direct, much like a SMART goal. Furthermore, it needs to be something that they can achieve based on their current skill level. The skill itself cannot be so complicated that they cannot obtain a rudimentary understanding from their prior knowledge.

When introducing the learning intention, quiz students on what they already know. You will be surprised at the information they come up with, and at how accurate their guesses can be. Use this prior knowledge to help guide their understanding of what they will learn.

Stop and Reflect

After each activity, stop and pause. Ask students whether they feel they’re working towards the learning intention successfully. This will help you understand whether they are on the right track or whether you need to adjust your teaching.


A Step-by-Step Plan for Teaching Narrative Writing

July 29, 2018



“Those who tell the stories rule the world.” This proverb, attributed to the Hopi Indians, is one I wish I’d known a long time ago, because I would have used it when teaching my students the craft of storytelling. With a well-told story we can help a person see things in an entirely new way. We can forge new relationships and strengthen the ones we already have. We can change a law, inspire a movement, make people care fiercely about things they’d never given a passing thought.

But when we study storytelling with our students, we forget all that. Or at least I did. When my students asked why we read novels and stories, and why we wrote personal narratives and fiction, my defense was pretty lame: I probably said something about the importance of having a shared body of knowledge, or about the enjoyment of losing yourself in a book, or about the benefits of having writing skills in general.

I forgot to talk about the power of story. I didn’t bother to tell them that the ability to tell a captivating story is one of the things that makes human beings extraordinary. It’s how we connect to each other. It’s something to celebrate, to study, to perfect. If we’re going to talk about how to teach students to write stories, we should start by thinking about why we tell stories at all. If we can pass that on to our students, then we will be going beyond a school assignment; we will be doing something transcendent.

Now. How do we get them to write those stories? I’m going to share the process I used for teaching narrative writing. I used this process with middle school students, but it would work with most age groups.

A Note About Form: Personal Narrative or Short Story?

When teaching narrative writing, many teachers separate personal narratives from short stories. In my own classroom, I tended to avoid having my students write short stories because personal narratives were more accessible. I could usually get students to write about something that really happened, while it was more challenging to get them to make something up from scratch.

In the “real” world of writers, though, the main thing that separates memoir from fiction is labeling: A writer might base a novel heavily on personal experiences, but write it all in third person and change the names of characters to protect the identities of people in real life. Another writer might create a short story in first person that reads like a personal narrative, but is entirely fictional. Just last weekend my husband and I watched the movie Lion and were glued to the screen the whole time, knowing it was based on a true story. James Frey’s book A Million Little Pieces sold millions of copies as a memoir but was later found to contain more than a little bit of fiction. Then there are unique books like Curtis Sittenfeld’s brilliant novel American Wife, based heavily on the early life of Laura Bush but written in first person, with fictional names and settings, and labeled as a work of fiction. The line between fact and fiction has always been really, really blurry, but the common thread running through all of it is good storytelling.

With that in mind, the process for teaching narrative writing can be exactly the same for writing personal narratives or short stories; it’s the same skill set. So if you think your students can handle the freedom, you might decide to let them choose personal narrative or fiction for a narrative writing assignment, or simply tell them that whether the story is true doesn’t matter, as long as they are telling a good story and they are not trying to pass off a fictional story as fact.

Here are some examples of what that kind of flexibility could allow:

  • A student might tell a true story from their own experience, but write it as if it were a fiction piece, with fictional characters, in third person.
  • A student might create a completely fictional story, but tell it in first person, which would give it the same feel as a personal narrative.
  • A student might tell a true story that happened to someone else, but write it in first person, as if they were that person. For example, I could write about my grandmother’s experience of getting lost as a child, but I might write it in her voice.

If we aren’t too restrictive about what we call these pieces, and we talk about different possibilities with our students, we can end up with lots of interesting outcomes. Meanwhile, we’re still teaching students the craft of narrative writing.

A Note About Process: Write With Your Students

One of the most powerful techniques I used as a writing teacher was to do my students’ writing assignments with them. I would start my own draft at the same time as they did, composing “live” on the classroom projector, and doing a lot of thinking out loud so they could see all the decisions a writer has to make.

The most helpful parts for them to observe were the early drafting stage, where I just scratched out whatever came to me in messy, run-on sentences, and the revision stage, where I crossed things out, rearranged, and made tons of notes on my writing. I have seen over and over again how witnessing that process can really help to unlock a student’s understanding of how writing actually gets made.

A Narrative Writing Unit Plan

Before I get into these steps, I should note that there is no one right way to teach narrative writing, and plenty of accomplished teachers are doing it differently and getting great results. This just happens to be a process that has worked for me.

Step 1: Show Students That Stories Are Everywhere

Getting our students to tell stories should be easy. They hear and tell stories all the time. But when they actually have to put words on paper, they forget their storytelling abilities: They can’t think of a topic. They omit relevant details, but go on and on about irrelevant ones. Their dialogue is bland. They can’t figure out how to start. They can’t figure out how to end.

So the first step in getting good narrative writing from students is to help them see that they are already telling stories every day. They gather at lockers to talk about that thing that happened over the weekend. They sit at lunch and describe an argument they had with a sibling. Without even thinking about it, they begin sentences with “This one time…” and launch into stories about their earlier childhood experiences. Students are natural storytellers; learning how to do it well on paper is simply a matter of studying good models, then imitating what those writers do.

So start off the unit by getting students to tell their stories. In journal quick-writes, think-pair-shares, or by playing a game like Concentric Circles, prompt them to tell some of their own brief stories: A time they were embarrassed. A time they lost something. A time they didn’t get to do something they really wanted to do. By telling their own short anecdotes, they will grow more comfortable and confident in their storytelling abilities. They will also be generating a list of topic ideas. And by listening to the stories of their classmates, they will be adding onto that list and remembering more of their own stories.

And remember to tell some of your own. Besides being a good way to bond with students, sharing your stories will help them see more possibilities for the ones they can tell.

Step 2: Study the Structure of a Story

Now that students have a good library of their own personal stories pulled into short-term memory, shift your focus to a more formal study of what a story looks like.

Use a diagram to show students a typical story arc like the one below. Then, using a simple story (try a video like The Present or Room), fill out the story arc with the components from that story. Once students have seen this story mapped out, have them try it with another one, like a story you’ve read in class, a whole novel, or another short video.

[Story arc diagram]

Step 3: Introduce the Assignment

Up to this point, students have been immersed in storytelling. Now give them specific instructions for what they are going to do. Share your assignment rubric so they understand the criteria that will be used to evaluate them; it should be ready and transparent right from the beginning of the unit. As always, I recommend using a single point rubric for this.

Step 4: Read Models

Once the parameters of the assignment have been explained, have students read at least one model story, a mentor text that exemplifies the qualities you’re looking for. This should be a story on a topic your students can kind of relate to, something they could see themselves writing. For my narrative writing unit (see the end of this post), I wrote a story called “Frog” about a 13-year-old girl who finally gets to stay home alone, then finds a frog in her house and gets completely freaked out, which basically ruins the fun she was planning for the night.

They will be reading this model as writers, looking at how the author shaped the text for a purpose, so that they can use those same strategies in their own writing. Have them look at your rubric and find places in the model that illustrate the qualities listed in the rubric. Then have them complete a story arc for the model so they can see the underlying structure.

Ideally, your students will have already read lots of different stories to look to as models. If that isn’t the case, this list of narrative texts recommended by Cult of Pedagogy followers on Twitter would be a good place to browse for titles that might be right for your students. Keep in mind that we have not read most of these stories, so be sure to read them first before adopting them for classroom use.


Step 5: Story Mapping

At this point, students will need to decide what they are going to write about. If they are stuck for a topic, have them just pick something they can write about, even if it’s not the most captivating story in the world. A skilled writer could tell a great story about deciding what to have for lunch. If they are using the skills of narrative writing, the topic isn’t as important as the execution.

Have students complete a basic story arc for their chosen topic using a diagram like the one below. This will help them make sure that they actually have a story to tell, with an identifiable problem, a sequence of events that build to a climax, and some kind of resolution, where something is different by the end. Again, if you are writing with your students, this would be an important step to model for them with your own story-in-progress.

[Story arc diagram]

Step 6: Quick Drafts

Now, have students get their chosen story down on paper as quickly as possible: This could be basically a long paragraph that would read almost like a summary, but it would contain all the major parts of the story. Model this step with your own story, so they can see that you are not shooting for perfection in any way. What you want is a working draft, a starting point, something to build on for later, rather than a blank page (or screen) to stare at.

Step 7: Plan the Pacing

Now that the story has been born in raw form, students can begin to shape it. This would be a good time for a lesson on pacing, where students look at how writers expand some moments to create drama and shrink other moments so that the story doesn’t drag. Creating a diagram like the one below forces a writer to decide how much space to devote to all of the events in the story.

[Pacing diagram]

Step 8: Long Drafts

With a good plan in hand, students can now slow down and write a proper draft, expanding the sections of their story that they plan to really draw out and adding in more of the details that they left out in the quick draft.

Step 9: Workshop

Once students have a decent rough draft—something that has a basic beginning, middle, and end, with some discernible rising action, a climax of some kind, and a resolution—you’re ready to shift into full-on workshop mode. I would do this for at least a week: Start class with a short mini-lesson on some aspect of narrative writing craft, then give students the rest of the period to write, conference with you, and collaborate with their peers. During that time, they should focus some of their attention on applying the skill from the mini-lesson to their drafts, so they will improve a little bit every day.

Topics for mini-lessons can include:

  • How to weave exposition into your story so you don’t give readers an “information dump”
  • How to carefully select dialogue to create good scenes, rather than quoting everything in a conversation
  • How to punctuate and format dialogue so that it imitates the natural flow of a conversation
  • How to describe things using sensory details and figurative language; also, what to describe…students too often give lots of irrelevant detail
  • How to choose precise nouns and vivid verbs, use a variety of sentence lengths and structures, and add transitional words, phrases, and features to help the reader follow along
  • How to start, end, and title a story

Step 10: Final Revisions and Edits

As the unit nears its end, students should be shifting away from revision, in which they alter the content of a piece, toward editing, where they make smaller changes to the mechanics of the writing. Make sure students understand the difference between the two: They should not be correcting each other’s spelling and punctuation in the early stages of this process, when the focus should be on shaping a better story.

One of the most effective strategies for revision and editing is to have students read their stories out loud. In the early stages, this will reveal places where information is missing or things get confusing. Later, more read-alouds will help them immediately find missing words, unintentional repetitions, and sentences that just “sound weird.” So get your students to read their work out loud frequently. It also helps to print stories on paper: For some reason, seeing the words in print helps us notice things we didn’t see on the screen.

To get the most from peer review, where students read and comment on each other’s work, more modeling from you is essential: Pull up a sample piece of writing and show students how to give specific feedback that helps, rather than simply writing “good detail” or “needs more detail,” the two comments I saw exchanged most often on students’ peer-reviewed papers.

Step 11: Final Copies and Publication

Once revision and peer review are done, students will hand in their final copies. If you don’t want to get stuck with 100-plus papers to grade, consider using Catlin Tucker’s station rotation model, which keeps all the grading in class. And when you do return stories with your own feedback, try using Kristy Louden’s delayed grade strategy, where students don’t see their final grade until they have read your written feedback.

Beyond the standard hand-in-for-a-grade, consider other ways to have students publish their stories. Here are some options:

  • Stories could be published as individual pages on a collaborative website or blog.
  • Students could create illustrated e-books out of their stories.
  • Students could create a slideshow to accompany their stories and record them as digital storytelling videos. This could be done with a tool like Screencastify or Screencast-O-Matic .

So this is what worked for me. If you’ve struggled to get good stories from your students, try some or all of these techniques next time. I think you’ll find that all of your students have some pretty interesting stories to tell. Helping them tell their stories well is a gift that will serve them for many years after they leave your classroom. ♦

Want this unit ready-made?

If you’re a writing teacher in grades 7-12 and you’d like a classroom-ready unit like the one described above, including slideshow mini-lessons on 14 areas of narrative craft, a sample narrative piece, editable rubrics, and other supplemental materials to guide students through every stage of the process, take a look at my Narrative Writing unit. Just click on the image below and you’ll be taken to a page where you can read more and see a detailed preview of what’s included.



Categories: Instruction , Podcast

Tags: English language arts , Grades 6-8 , Grades 9-12 , teaching strategies

52 Comments

Wow, this is a wonderful guide! If my English teachers had taught this way, I’m sure I would have enjoyed narrative writing instead of dreading it. I’ll be able to use many of these suggestions when writing my blog! BrP

Last year I was so discouraged because the short stories looked like the quick drafts described in this article. I thought I had totally failed until I read this and realized I did not fail, I just needed to complete the process. Thank you!

I feel like you jumped in my head and connected my thoughts. I appreciate the time you took to stop and look closely at form. I really believe that student-writers should see all dimensions of narrative writing and be able to live in whichever style and voice they want for their work.

Can’t thank you enough for this. So well curated that one can just follow it blindly and ace at teaching it. Thanks again!

Great post! I especially liked your comments about reminding kids about the power of storytelling. My favourite podcasts and posts from you are always about how to do things in the classroom and I appreciate the research you do.

On a side note, the ice breakers are really handy. My kids know each other really well (rural community), and can tune out pretty quickly if there is nothing new to learn about their peers, but they like the games (and can remember where we stopped last time weeks later). I’ve started changing them up with ‘life questions’, so the editable version is great!

I love writing with my students and loved this podcast! A fun extension to this narrative is to challenge students to write another story about the same event, but use the perspective of another “character” from the story. Books like Wonder (R.J. Palacio) and Wanderer (Sharon Creech) can model the concept for students.

Thank you for your great efforts to reveal practical writing strategies in layered detail. As English is not my first language, I need to listen to your podcast and read the text repeatedly to fully understand. It’s worth the time for a great post like yours. I love sharing, so I sent the link to my English practice group so that more people can benefit. I hope I will be able to give you some feedback later on.

Thank you for helping me get to know the techniques of writing narrative text better. I’m an English teacher of five years but have little knowledge of teaching writing. I hope you could feature techniques for writing news and feature stories. God bless and more power!

Thank you for this! I am very interested in teaching a unit on personal narrative and this was an extremely helpful breakdown. As a current student teacher I am still unsure how to approach breaking down the structures of different genres of writing in a way that is helpful for my students but not too restrictive. The story mapping tools you provided really allowed me to think about this in a new way. Writing is such a powerful way to experience the world and more than anything I want my students to realize its power. Stories are how we make sense of the world and as an English teacher I feel obligated to give my students access to this particular skill.

The power of story is unfathomable. There’s this NGO in India doing some great work in harnessing the power of storytelling and plots to brighten children’s lives and enlighten them with true knowledge. Check out Katha India here: http://bit.ly/KathaIndia

Thank you so much for this. I did not go to college to become a writing professor, but due to restructuring in my department, I indeed am! This is a wonderful guide that I will use when teaching the narrative essay. I wonder if you have a similar guide for other modes such as descriptive, process, argument, etc.?

Hey Melanie, Jenn does have another guide on writing! Check out A Step-by-Step Plan for Teaching Argumentative Writing.

Hi, I am also wondering if there is a similar guide for descriptive writing in particular?

Hey Melanie, unfortunately Jenn doesn’t currently have a guide for descriptive writing. She’s always working on projects though, so she may get around to writing a unit like this in the future. You can always check her Teachers Pay Teachers page for an up-to-date list of materials she has available. Thanks!

I want to write about the new character in my area

That’s great! Let us know if you need any supports during your writing process!

I absolutely adore this unit plan. I teach freshmen English at a low-income high school and wanted to find something to help my students find their voice. It is not often that I borrow material, but I borrowed and adapted all of it in the order that it is presented! It is cohesive, understandable, and fun. Thank you!!

So glad to hear this, Nicole!

Thanks for sharing this post. My students often get confused between personal narratives and short stories. Whenever I ask them to write a short story, they share their own experiences and add a bit of fiction to make it interesting.

Thank you! My students have loved this so far. I do have a question as to where the “Frog” story mentioned in Step 4 is. I could really use it! Thanks again.

This is great to hear, Emily! In Step 4, Jenn mentions that she wrote the “Frog” story for her narrative writing unit. Just scroll down to the bottom of the post and you’ll see a link to the unit.

I also cannot find the link to the short story “Frog”– any chance someone can send it or we can repost it?

This story was written for Jenn’s narrative writing unit. You can find a link to this unit in Step 4 or at the bottom of the article. Hope this helps.


I cannot find the frog story mentioned. Could you please send the link? Thank you

Hi Michelle,

The Frog story was written for Jenn’s narrative writing unit. There’s a link to this unit in Step 4 and at the bottom of the article.

Debbie, thanks for your reply… but there is no link to the story in Step 4 or at the bottom of the page.

Hey Shawn, the frog story is part of Jenn’s narrative writing unit, which is available on her Teachers Pay Teachers site. The link Debbie is referring to at the bottom of this post will take you to her narrative writing unit and you would have to purchase that to gain access to the frog story. I hope this clears things up.


Thank you so much for this resource! I’m a high school English teacher, and am currently teaching creative writing for the first time. I really do value your blog, podcast, and other resources, so I’m excited to use this unit. I’m a cyber school teacher, so a clear, organized layout is important, and I spend a lot of time making sure my content is visually accessible for my students to process. Thanks for creating resources that are easy for us teachers to process and use.


Do you have a lesson for Informative writing?

Hey Cari, Jenn has another unit on argumentative writing , but doesn’t have one yet on informative writing. She may develop one in the future so check back in sometime.


I had the same question. It’s difficult to build a good, strong informational writing unit when there are so many different text structures to cover and text-dependent writing tasks to include.

Creating an informational writing unit is still on Jenn’s long list of projects to get to, but in the meantime, if you haven’t already, check out When We All Teach Text Structures, Everyone Wins. It might help you out!


This is a great lesson! It would be helpful to see a finished draft of the frog narrative arc. Students’ greatest challenge is transferring their ideas from the planner to a full draft. To see a full sample of how this arc was transformed into a complete narrative draft would be a powerful learning tool.

Hi Stacey! Jenn goes into more depth with the “Frog” lesson in her narrative writing unit – this is where you can find a sample of what a completed story arc might look like. Also included is a draft of the narrative. If you’re interested in checking out the unit and seeing a preview, just scroll down to the bottom of the post and click on the image. Hope this helps!


Helped me prepare for an entrance exam, thanks very much!


Is the narrative writing lesson you talk about in https://www.cultofpedagogy.com/narrative-writing/ also doable for elementary students, do you think, and if so, to what levels?

Love your work, Sincerely, Zanyar

Hey Zanyar,

It’s possible the unit would work with 4th and 5th graders, but Jenn definitely wouldn’t recommend going any younger. The main reason for this is that some of the mini-lessons in the unit could be challenging for students who are still concrete thinkers. You’d likely need to do some adjusting and scaffolding, which could extend the unit beyond the 3 weeks. Having said that, I taught 1st grade and found the steps of the writing process, as described in the post, to be very similar. Of course learning targets/standards were different, but the process itself can be applied to any grade level (modeling writing, using mentor texts to study how stories work, planning the structure of the story, drafting, elaborating, etc.). Hope this helps!


This has made my life so much easier. After teaching in different school systems, from American to British to IB, one needs to identify the anchor standards and concepts that are common across all these systems in order to build well-balanced thematic units. Just reading these steps gave me the guidance I needed to satisfy both the conceptual framework the schools ask for and the standards-based practice. Thank you, thank you.


Would this work for teaching a first grader about narrative writing? I am also looking for a great book to use as a model for narrative writing. Veggie Monster is being used by his teacher and he isn’t connecting with this book in the least bit, so it isn’t having a positive impact. My fear is he will associate this with writing and I don’t want a negative association connected to such a beautiful process and experience. Any suggestions would be helpful.

Thank you for any information you can provide!

Although I think the materials in the actual narrative writing unit are really too advanced for a first grader, the general process that’s described in the blog post can still work really well.

I’m sorry your child isn’t connecting with The Night of the Veggie Monster. Try to keep in mind that the main reason this is used as a mentor text is because it models how a small moment story can be told in a big way. It’s filled with all kinds of wonderful text features that impact the meaning of the story – dialogue, description, bold text, speech bubbles, changes in text size, ellipses, zoomed-in images, text placement, text shape, etc. All of these things will become mini-lessons throughout the unit. But there are lots of other wonderful mentor texts that your child might enjoy. My suggestion for an early writer is to look for a small moment text, similar in structure, that zooms in on a problem that a first grader can relate to. In addition to the mentor texts that I found in this article, you might also want to check out Knuffle Bunny, Kitten’s First Full Moon, When Sophie Gets Angry Really Really Angry, and Whistle for Willie. Hope this helps!


I saw this on Pinterest the other day while searching for examples of narrative units/lessons. I clicked on it because I always click on C.o.P stuff 🙂 And I wasn’t disappointed. I was intrigued by the connection of narratives to humanity–even if a student doesn’t identify as a writer, he/she certainly is human, right? I really liked this. THIS clicked with me.

A few days after I read the C.o.P post, I ventured on to YouTube for more ideas to help guide me with my 8th graders’ narrative writing this coming spring. And there was a TEDx video titled “The Power of Personal Narrative” by J. Christian Jensen. I immediately remembered the line from the article above that associated storytelling with “power” and how it sets humans apart and if introduced and taught as such, it can be “extraordinary.”

I watched the video and, beyond all my expectations, it was FANTASTIC. Jennifer’s post and the TEDx video together ignited within me some major motivation and excitement to begin this unit.


Thanks for sharing this with us! So glad that Jenn’s post paired with another text gave you some motivation and excitement. I’ll be sure to pass this on to Jenn!


Thank you very much for this really helpful post! I really love the idea of helping our students understand that storytelling is powerful and then going on to teach them how to harness that power. That is the essence of teaching literature or writing at any level. However, I’m a little worried about telling students that whether a piece of writing is fact or fiction does not matter. It in fact matters a lot, precisely because storytelling is powerful. Narratives can shape people’s views and engage their emotions, which would, in turn, motivate them to act on a certain matter, whether for good or for bad. A fictional narrative that is passed off as factual could cause a lot of damage in the real world. I can see how helping students focus on writing the story rather than the truth of it all could help refine the needed skills without distractions. Nevertheless, would it not be prudent to teach our students not just to harness the power of storytelling but also to refrain from misusing it by pushing false narratives as factual? It is true that in reality, memoirs pass as factual while novels pass as fictional, though the opposite may be true in both cases. I am not too worried about novels passing as fictional. On the other hand, fictional narratives masquerading as factual are disconcerting and part of a phenomenon that needs to be fought against, not enhanced or condoned in education. This is especially true because memoirs are often used by powerful people to write or re-write history. I would really like to hear your opinion on this. Thanks a lot for a great post and a lot of helpful resources!

Thank you so much for this. Jenn and I had a chance to chat and we can see where you’re coming from. Jenn never meant to suggest that a person should pass off a piece of fictional writing as a true story. Good stories can be true, completely fictional, or based on a true story that’s mixed with some fiction – that part doesn’t really matter. However, what does matter is how a student labels their story. We think that could have been stated more clearly in the post, so Jenn decided to add a bit about this at the end of the 3rd paragraph in the section “A Note About Form: Personal Narrative or Short Story?” Thanks again for bringing this to our attention!


You have no idea how much your page has helped me in so many ways. I am currently in my teaching credential program and there are times that I feel lost due to a lack of experience in the classroom. I’m so glad I came across your page! Thank you for sharing!

Thanks so much for letting us know; this means a whole lot!


No, we’re sorry. Jenn actually gets this question fairly often. It’s something she considered doing at one point, but because she has so many other projects she’s working on, she’s just not gotten to it.


I couldn’t find the story


Hi, Duraiya. The “Frog” story is part of Jenn’s narrative writing unit, which is available on her Teachers Pay Teachers site. The link at the bottom of this post will take you to her narrative writing unit, which you can purchase to gain access to the story. I hope this helps!


I am using this step-by-step plan to help me teach personal narrative story writing. I wanted to show the Coca-Cola story, but the link says the video is not available. Do you have a new link or can you tell me the name of the story so I can find it?

Thank you for putting this together.

Hi Corri, sorry about that. The Coca-Cola commercial disappeared, so Jenn just updated the post with links to two videos with good stories. Hope this helps!


Writing an argument - Learning intention guide

  • This assessment schedule could be used with the following resources: Persuasive language III, Persuasive speech II, P.E. - is it worth it?, Organ donation, Single Sex Education, School Uniforms, or with teacher-developed assessment tasks.
  • This assessment guide could be used for either self- or peer-assessment purposes, or a combination of both.
  • After selecting the criteria for the focus of the teaching and learning, teachers can print off a guide without the reflection boxes (which are present on the student's guide).
  • The teacher's guide could be enlarged as a chart for sharing, and/or for working up examples in the contexts relevant to students.
  • Students should be familiar with how to self- and/or peer-assess before using this guide, and with the features of an argument.
  • Ideally, the assessment would be followed up with a teacher conference.
  • The 'next time' section of the assessment guide is for students to set their next goals. This section could be glued into the student's work book as a record.
  • When explaining to students how to complete the assessment task, teachers could include the following points:
      • Use the assessment guide to help you plan and write your argument.
      • Write your argument.
      • When you have finished, use the guide to assess and reflect on your work.

  

  • Writing an explanation - Learning intention guide
  • Learning intention guides
  • Proofreading your writing - Learning intention guide
  • Writing a report - Learning intention guide
  • Writing a recount - Learning intention guide
  • Writing instructions - Learning intention guide
  • Editing your writing - Learning intention guide

What Is Learning? Essay about Learning Importance



Introduction


Learning is a continuous process that involves the transformation of information and experience into abilities and knowledge. In my view, learning is a two-way process that involves the learner and the educator, leading to knowledge acquisition as well as capability.

This view informs my practice in the educational sector by ensuring that both the students and the teacher participate during the learning process, making it more real and enjoyable so that the learners can clearly understand. Students hold many different conceptions of learning, and these different views affect teaching and learning in various ways.

What Is Learning? The Key Concepts

One of the learning concepts held by students is the precise presentation of learning material. This means that any material that is meant for learning should be very clear and put in a language that the learners comprehend (Blackman & Benson 2003). The material should also be detailed, with many examples that are relevant to the prior knowledge of the learner.

This means that the learner must have pertinent prior knowledge. This can be obtained by the teacher explaining new ideas and words that are to be encountered in a certain field or topic, which might span several consecutive lessons. Different examples assist the students in approaching ideas from many perspectives.

The learner is able to draw similarities from the many examples given, thus leading to a better understanding of a concept, since the ideas are related and linked.

Secondly, new meanings should be incorporated into the students’ prior knowledge, instead of remembering only the definitions or procedures. Therefore, to promote expressive learning, instructional methods that relate new information to the learner’s prior knowledge should be used.

Moreover, significant learning involves the use of evaluation methods that inspire learners to relate their existing knowledge with new ideas. For the students to comprehend complex ideas, they must be combined with the simple ideas they know.

Teaching becomes very easy when a lesson starts with simple concepts that the students are familiar with. The students should start by understanding what they know so that they can use those ideas in comprehending complex concepts. This makes learning smooth and easy for both the learner and the educator (Chermak & Weiss 1999).

Thirdly, acquisition of the basic concepts is very essential for the student to understand the threshold concepts. This is because the basic concepts act as a foundation in learning a certain topic or procedure. So, the basic concepts must be comprehended first before proceeding to the incorporation of the threshold concepts.

This makes the student have a clear understanding of each stage due to the possession of initial knowledge (Felder & Brent 1996). A deeper foundation of the study may also be achieved by clearly distinguishing between various concepts and by knowing the necessary as well as the unnecessary aspects. Basic concepts are normally taught in the lower classes of each level.

They include defining terms in each discipline. These terms aid in teaching in all the levels because they act as a foundation. The stage of acquiring the basics determines the students’ success in the rest of their studies.

This is because lack of basics leads to failure, since the students cannot understand the rest of the content in that discipline, which depends mostly on the basics. For learning to become effective for the students, the basics must be well understood, as well as their applications.

Learning by use of models to explain certain procedures or ideas in a certain discipline is also another learning concept held by students. Models are helpful in explaining complex procedures and they assist the students in understanding better (Blackman & Benson 2003).

For instance, in economics there are many models that are used by the students so that they can comprehend the essential interrelationships in that discipline. A model known as comparative statics is used by economics students to understand how equilibrium is used in economic reasoning, as well as the forces that bring back equilibrium after it has been disturbed.

The students must know the importance of using such models, the main aspect of the model, and its relationship with the visual representation. A model is one of the important devices that a learner can use to acquire knowledge. Models are mainly presented in diagram form using symbols or arrows.

Models simplify teaching, especially for slow learners, who get the concept slowly but clearly. They are the easiest and most effective means of learning complex procedures or directions. Most models are in the form of flowcharts.

Learners should get used to working with incomplete ideas so that more complete ideas become available to them and they can enjoy moving ahead. This is because, in the process of acquiring the threshold concepts, the prior knowledge acquired previously might be transformed.

So, the students must be ready to accept that at every stage in the learning process they get an understanding that is temporary. This problem intensifies when the understanding of an idea acquired currently changes the understanding of an idea that had been taught previously.

This leads to confusion that can make the weak students lose hope. That is why the teacher should always state clear similarities as well as differences between various concepts. On the other hand, the student should be able to compare different concepts and state their similarities as well as differences (Watkins & Regmy 1992).

The student should also be careful when dealing with concepts that seem similar and must always be attentive to get the first hand information from the teacher. Teaching and learning becomes very hard when learners do not concentrate by paying attention to what the teacher is explaining. For the serious students, learning becomes enjoyable and they do not get confused.

According to Chermak and Weiss (1999), learners must not just sit down and listen; they must involve themselves in other activities such as reading, writing, discussing, or solving problems. Basically, they must be very active and concentrate on what they are doing. These techniques are very essential because they have a great impact on the learners.

Students favor active learning over traditional lecture methods because they master the content better, and it aids in the development of skills such as writing and reading. So, methods that enhance active learning motivate the learners, since they also get more information from their fellow learners through discussions.

Students engage themselves in discussion groups or class presentations to break the monotony of the lecture method of learning. Learning is a two-way process, and so both the teacher and the student must be involved.

Active learning removes boredom in the class and the students get so much involved thus improving understanding. This arouses the mind of the student leading to more concentration. During a lecture, the student should write down some of the important points that can later be expounded on.

Involvement of the learners in challenging tasks is very important. The task should not be very difficult but rather slightly above the learner’s level of mastery. This motivates the learner and instills confidence. It leads to the learner’s success due to the self-confidence that aids in problem solving.

For instance, when a learner tackles a question that was deemed hard and gets the answer correct, it becomes the best kind of encouragement ever. The learner gains the confidence that he can make it, and this motivates him to achieve even more.

This kind of encouragement mostly occurs for the quick learners, because the slow learners fail in most cases. This makes the slow learners fear tackling many problems. So, the concept might not apply to all the learners, but slow learners who are determined can always seek help in case of such a problem.

Moreover, another concept held by students is repetition, because the most essential factor in learning is sufficient time on a task. For a student to study well, he or she should consider repetition, that is, looking at the same material over and over again.

For instance, before a teacher comes for the lesson, the student can review notes and then review the same notes after the teacher gets out of class. So, the student reviews the notes many times thus improving the understanding level (Felder & Brent 1996). This simplifies revising for an exam because the student does not need to cram for it.

Reviewing the same material makes teaching very easy, since the teacher does not need to go back to the previous material and start explaining again. It becomes very hard for those students who do not review their work at all, because they do not understand the teacher well and face a hard time when preparing for examinations.

Basically, learning requires enough time so that it can be effective. It also becomes a very big problem for those who do not set aside time for reviews.

Acquisition of the main points improves the student’s understanding of the material. Not everything that is learnt or taught is of importance. Therefore, the student must be very keen to identify the main points when learning. These points should be written down or underlined because they become useful when reviewing notes before doing an exam. This helps in saving time and leads to success.

For those students who do not pay attention, it becomes very difficult to highlight the main points. They read for the sake of it and give the teacher a very hard time during teaching. To overcome this problem, the students must be taught how to study so that learning can be effective.

Cooperative learning is also another concept held by the students. It is more detailed than group work because, when used properly, it leads to remarkable results. This is very encouraging in teaching and in the learning environment as well.

For learning to be productive, the students should not simply work with their friends; instead, every group should have at least one top-level student who can assist the weaker students. The groups assist them in achieving academic as well as social abilities due to the interaction. This learning concept benefits the students more because a fellow student can sometimes explain a concept better than the teacher can in class.

Assignments are then given to these groups through a selected group leader (Felder & Brent 1996). Every member must be active in contributing ideas, and respect for one another’s ideas is necessary. It becomes very easy for the teacher to mark such assignments, since there are fewer of them than when marking each individual’s work.

Learning becomes enjoyable because every student is given a chance to express his or her ideas freely and in a constructive manner. Teaching is also easier because the students encounter very many new ideas during the discussions. Some students deem it a waste of time, but it is necessary in every discipline.

Every group member should be given a chance to become the group’s facilitator whose work is to distribute and collect assignments. Dormant students are forced to become active because every group member must contribute his or her points. Cooperative learning is a concept that requires proper planning and organization.

Completion of assignments is another student-held learning concept. Its main aim is to assist the student in knowing whether the main concepts in a certain topic were understood. This acts as a kind of self-evaluation for the student and also assists the teacher in knowing whether the students understood a certain topic. The assignments must be submitted to the respective teacher for marking.

Those students who are focused follow up with the teacher after the assignments have been marked for clarification purposes. This enhances learning, and the student understands better. Many students differ with this idea because they do not like relating with the teacher (Marton & Beaty 1993). This leads to very poor grades, since communication is a very essential factor in learning.

Teaching becomes easier and enjoyable when there is a student- teacher relationship. Assignment corrections are necessary to both the student and the teacher since the student comprehends the right method of solving a certain problem that he or she could not before.

Lazy students who do not do corrections make teaching hard for the teacher because they make the other students lag behind. Learning may also become ineffective for them due to low levels of understanding.

Acquisition of facts is still another student-held concept that aims at understanding reality. Students capture the essential facts so that they can understand how the facts fit into another context. Many students fail to obtain the facts because they think that they can get everything taught in class or read from books.

When studying, the student must clearly understand the topic so that he or she can develop a theme. This helps in making short notes by eliminating unnecessary information. So, the facts must always be identified and well understood in order to apply them where necessary. Teaching becomes easier when the facts are well comprehended by the students because it enhances effective learning.

Effective learning occurs when a student possesses strong emotions. A strong memory that lasts long is linked with the emotional condition of the learner. This means that the learners will always remember well when learning is incorporated with strong emotions. Emotions develop when the students have a positive attitude towards learning (Marton & Beaty 1993).

This is because they will find learning enjoyable and exciting unlike those with a negative attitude who will find learning boring and of no use to them. Emotions affect teaching since a teacher will like to teach those students with a positive attitude towards what he is teaching rather than those with a negative attitude.

The positive attitude leads to effective learning because the students get interested in what they are learning and eventually leads to success. Learning does not become effective where students portray a negative attitude since they are not interested thus leading to failure.

Furthermore, learning through hearing is another student-held concept. This concept enables students to understand what they hear, thus calling for more attention and concentration. They prefer instructions that are given orally and are very keen, but they also participate by speaking. Teaching becomes very enjoyable since the students contribute a lot through talking and interviewing.

Learning occurs effectively because the students involve themselves in oral reading as well as listening to recorded information. In this concept, learning is mostly enhanced by debating, presenting reports orally and interviewing people. Those students who do not prefer this concept as a method of learning do not involve themselves in debates or oral discussions but use other learning concepts.

Learners may also use the concept of seeing to understand better. This makes them remember what they saw, and most of them prefer using written materials (Van Rossum & Schenk 1984). Unlike the auditory learners who grasp a concept through hearing, visual learners understand better by seeing.

They use their sight to learn and do it quietly. They prefer watching things like videos and learn from what they see. Learning occurs effectively since the memory is usually connected with visual images. Teaching becomes very easy when visual images are incorporated. These include such things as pictures, objects, and graphs.

A teacher can use charts during instruction thus improving the students’ understanding level or present a demonstration for the students to see. Diagrams are also necessary because most students learn through seeing.

The use of visual images makes learning look real, and the student gets the concept better than those who learn through imagination. This concept leads students to use texts that have many pictures, diagrams, graphics, maps, and graphs.

In learning, students may also use the tactile concept, whereby they gain knowledge and skills through touching. They gain knowledge mostly through manipulatives. Teaching becomes more effective when students are left to handle equipment for themselves, for instance in a laboratory practical. Students tend to understand better because they are able to follow instructions (Watkins & Regmy 1992).

After applying this concept, the students are able to engage themselves in making perfect drawings, making models, and following procedures to make something. Learning may not take place effectively for those students who do not like manipulating objects, because hands-on work arouses the memory and helps the student comprehend the concept in a better way.

Learning through analysis is also another concept held by students, because they are able to plan their work in an organized manner based on logical ideas. It requires individual learning, and effective learning occurs when information is given in steps. This requires the teacher to structure the lessons properly, and the goals should be clear.

This method of organizing ideas makes learning to become effective thus leading to success and achievement of the objectives. Analysis improves understanding of concepts to the learners (Watkins & Regmy 1992). They also understand certain procedures used in various topics because they are sequential.

Teaching and learning become very hard for those students who do not know how to analyze their work. Such students learn in a haphazard way, thus leading to failure.

If all the learning concepts held by students are incorporated, then remarkable results can be obtained. A lot of information and knowledge can be obtained through learning as long as the learner uses the best concepts for learning. Learners are also different: there are those who understand better by seeing, while others understand through listening or touching.

So, it is necessary for each learner to understand the best concept to use in order to improve the understanding level. For the slow learners, extra time should be taken while studying, and explanations must be clear to avoid confusion. There are also those who follow written instructions better than instructions given orally. Basically, learners are not the same and so require different techniques.

Reference List

Benson, A., & Blackman, D., 2003. Can research methods ever be interesting? Active Learning in Higher Education, Vol. 4, No. 1, pp. 39-55.

Chermak, S., & Weiss, A., 1999. Activity-based learning of statistics: Using practical applications to improve students’ learning. Journal of Criminal Justice Education, Vol. 10, No. 2, pp. 361-371.

Felder, R., & Brent, R., 1996. Navigating the bumpy road to student-centered instruction. College Teaching, Vol. 44, No. 2, pp. 43-47.

Marton, F., & Beaty, E., 1993. Conceptions of learning. International Journal of Educational Research, Vol. 19, pp. 277-300.

Van Rossum, E., & Schenk, S., 1984. The relationship between learning conception, study strategy and learning outcome. British Journal of Educational Psychology, Vol. 54, No. 1, pp. 73-85.

Watkins, D., & Regmy, M., 1992. How universal are student conceptions of learning? A Nepalese investigation. Psychologia, Vol. 25, No. 2, pp. 101-110.

What Is Learning? FAQ

  • Why Is Learning Important? Learning means gaining new knowledge, skills, and values, whether in a group or on one's own. It helps a person develop, maintain their interest in life, and adapt to change.
  • Why Is Online Learning Good? Online learning has a number of advantages over traditional learning. First, it allows you to collaborate with top experts in your area of interest, no matter where you are located. Second, it encourages independence and helps you develop time-management skills. Last but not least, it saves time on transport.
  • How Can You Overcome Challenges in Online Learning? The most challenging aspects of distance learning are the lack of face-to-face communication and the lack of feedback. The key to overcoming these challenges is effective communication with teachers and classmates through videoconferencing, email, and chat.

IvyPanda. (2019, May 2). What Is Learning? Essay about Learning Importance. https://ivypanda.com/essays/what-is-learning-essay/


Persuasive Writing Techniques


A small lesson on understanding the techniques of persuasion and how to get better at persuasive writing. Children act out persuasion techniques in small groups and also sort them from most powerful to least powerful.

Australian Curriculum Links:

  • Understand how authors often innovate on text structures and play with language features to achieve particular aesthetic, humorous and persuasive purposes and effects (ACELA1518)
  • Select, navigate and read texts for a range of purposes, applying appropriate text processing strategies and interpreting structural features, for example table of contents, glossary, chapters, headings and subheadings (ACELY1712)
  • Plan, draft and publish imaginative, informative and persuasive texts, choosing and experimenting with text structures, language features, images and digital resources appropriate to purpose and audience (ACELY1714)
  • Reread and edit students' own and others' work using agreed criteria and explaining editing choices (ACELY1715)
  • Plan, rehearse and deliver presentations, selecting and sequencing appropriate content and multimodal elements to influence a course of action (ACELY1751)

Lesson Outline:

Timeframe: 45 mins – 1 hour 30 mins

Introduction:

  • Set the learning intention on the board so students are clear on what they are learning and why (learning how to use persuasive language to persuade people to believe our thoughts, as politicians and lawyers do outside of school).
  • Watch a variety of election promises from politicians and discuss how strongly they persuade you (the children) to believe them.
  • Hand out the Persuasive Language Techniques sheet and ask children to get into groups; each group creates 4 examples for 1 of the techniques (e.g., Group 1 has 'ATTACKS', so they come up with 4 different examples) to be modeled back to the class (role play).
  • At the end of the role play, give every child 3 sticky notes and list all the Persuasive Language Techniques on the board. Ask small groups to come up and place their 3 sticky notes next to the techniques they saw as most powerful.
  • Discuss the results when all children have finished, and clarify which techniques are more powerful if some students haven't seen it.

Assessment:

  • Success criteria (developed by children at the start): "I know I will be successful when…"
  • Anecdotal notes
  • Photo of sticky notes (with names on them) showing that they understand which techniques are more powerful than others.

Resources:

  • Persuasive Language Techniques (Word document)
  • Sticky notes


10 Ways to Detect AI Writing Without Technology

As more of my students have submitted AI-generated work, I’ve gotten better at recognizing it.


AI-generated papers have become regular but unwelcome guests in the undergraduate college courses I teach. I first noticed an AI paper submitted last summer, and in the months since I’ve come to expect to see several per assignment, at least in 100-level classes.

I’m far from the only teacher dealing with this. Turnitin recently announced that in the year since it debuted its AI detection tool, about 3 percent of papers it reviewed were at least 80 percent AI-generated.

Just as AI has improved and grown more sophisticated over the past 9 months, so have teachers. AI often has a distinct writing style with several tells that have become more and more apparent to me the more frequently I encounter them.

Before we get to these strategies, however, it’s important to remember that suspected AI use isn’t immediate grounds for disciplinary action. These cases should be used as conversation starters with students and even – forgive the cliché – as a teachable moment to explain the problems with using AI-generated work.

To that end, I've written previously about how I handled these suspected AI cases, the troubling limitations and discriminatory tendencies of existing AI detectors, and what happens when educators incorrectly accuse students of using AI.

With those caveats firmly in place, here are the signs I look for to detect AI use from my students.

1. The Submission Is Too Long

When an assignment asks students for one paragraph and a student turns in more than a page, my spidey sense goes off.


Almost every class does have one overachieving student who will do this without AI, but that student usually sends 14 emails the first week and submits every assignment early, and, most importantly, their work, while too long, is usually genuinely well written. A student who suddenly overproduces raises a red flag.

2. The Answer Misses The Mark While Also Being Too Long

Length by itself isn't enough to identify AI use, but overlong assignments often have additional strange features that make them suspicious.

For instance, the assignment might be four times the required length yet doesn’t include the required citations or cover page. Or it goes on and on about something related to the topic but doesn’t quite get at the specifics of the actual question asked.
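
The two signals above (excess length, missing required elements) lend themselves to a rough mechanical check. Here is a minimal illustrative sketch in Python; the 4x threshold and the crude parenthetical-citation test are invented for the example, and a flag is only a conversation starter, never proof of AI use:

```python
def flag_submission(text, required_words, requires_citations=False):
    """Return a list of reasons a submission looks suspicious.

    Purely illustrative heuristics: the thresholds are hypothetical,
    and a non-empty result warrants a conversation, not an accusation.
    """
    words = text.split()
    reasons = []
    if len(words) > 4 * required_words:
        reasons.append("over 4x the required length")
    if requires_citations and "(" not in text:
        reasons.append("no parenthetical citations found")
    return reasons

# A 500-word answer to a 100-word prompt with no citations trips both checks.
sample = "word " * 500
print(flag_submission(sample, required_words=100, requires_citations=True))
```

Anything this flags would still go through the human review and conversation described above.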

3. AI Writing is Emotionless Even When Describing Emotions 

If ChatGPT were a musician, it would be Kenny G or Muzak. As it stands now, AI writing is the equivalent of verbal smooth jazz or grey noise. ChatGPT, for instance, has a very peppy, positive vibe that somehow doesn't convey actual emotion.

One assignment I have asks students to reflect on important memories or favorite hobbies. You immediately sense the hollowness of ChatGPT's response to this kind of prompt. For example, I just told ChatGPT I loved skateboarding as a kid and asked it for an essay describing that. Here’s how ChatGPT started:

As a kid, there was nothing more exhilarating than the feeling of cruising on my skateboard. The rhythmic sound of wheels against pavement, the wind rushing through my hair, and the freedom to explore the world on four wheels – skateboarding was not just a hobby; it was a source of unbridled joy.

You get the point. It’s like an extended elevator jazz sax solo but with words.

4. Cliché Overuse

Part of the reason AI writing is so emotionless is that its cliché use is, well, on steroids.

Take the skateboarding example in the previous entry. Even in the short sample, we see lines such as “the wind rushing through my hair, and the freedom to explore the world on four wheels.” Students, regardless of their writing abilities, always have more original thoughts and ways of seeing the world than that. If a student actually wrote something like that, we’d encourage them to be more authentic and truly descriptive.

Of course, with more prompt adjustments, ChatGPT and other AI tools can do better, but the students using AI for assignments rarely put in this extra time.

5. The Assignment Is Submitted Early

I don’t want to cast aspersions on those true overachievers who get their suitcases packed a week before vacation starts, finish winter holiday shopping in July, and have already started saving for retirement, but an early submission may be the first signal that I’m about to read some robot writing.

For example, several students this semester submitted an assignment the moment it became available. That is unusual, and in all of these cases, their writing also exhibited other stylistic points consistent with AI writing.

Warning: Use this tip with caution as it is also true that many of my best students have submitted assignments early over the years.

6. The Setting Is Out of Time

AI image generators frequently have little tells signaling that the model that created the image doesn't understand what the world actually looks like — think extra fingers on human hands or buildings that don't really follow the laws of physics.

When AI is asked to write fiction or describe something from a student’s life, similar mistakes often occur. Recently, a short story assignment in one of my classes resulted in several stories that took place in a nebulous time frame that jumped between modern times and the past with no clear purpose.

If done intentionally this could actually be pretty cool and give the stories a kind of magical realism vibe, but in these instances, it was just wonky and out-of-left-field, and felt kind of alien and strange. Or, you know, like a robot had written it.

7. Excessive Use of Lists and Bullet Points  

Here are some reasons that I suspect students are using AI if their papers have many lists or bullet points:

1. ChatGPT and other AI generators frequently present information in list form even though human authors generally know that’s not an effective way to write an essay.

2. Most human writers will not inherently write this way, especially new writers who often struggle with organizing information.

3. While lists can be a good way to organize information, presenting more complex ideas in this manner can be…

4. …annoying.

5. Do you see what I mean?

6. (Yes, I know, it's ironic that I'm complaining about this here given that this story is also a list.)

8. It’s Mistake-Free 

I've criticized ChatGPT's writing here, yet in fairness it does produce very clean prose that is, on average, more error-free than what many of my students submit. Even experienced writers miss commas, write long and awkward sentences, and make little mistakes – which is why we have editors. ChatGPT's writing isn't so much too "perfect" as it is too clean.

9. The Writing Doesn’t Match The Student’s Other Work  

Writing instructors know this inherently and have long been on the lookout for changes in voice that could be an indicator that a student is plagiarizing work.

AI writing doesn't really change that. When a student submits new work that is wildly different from previous work, or when their discussion board comments are riddled with errors not found in their formal assignments, it's time to take a closer look.

10. Something Is Just . . . Off 

The boundaries between these different AI writing tells blur together and sometimes it's a combination of a few things that gets me to suspect a piece of writing. Other times it’s harder to tell what is off about the writing, and I just get the sense that a human didn’t do the work in front of me.

I’ve learned to trust these gut instincts to a point. When confronted with these more subtle cases, I will often ask a fellow instructor or my department chair to take a quick look (I eliminate identifying student information when necessary). Getting a second opinion helps ensure I’ve not gone down a paranoid “my students are all robots and nothing I read is real” rabbit hole. Once a colleague agrees something is likely up, I’m comfortable going forward with my AI hypothesis based on suspicion alone, in part, because as mentioned previously, I use suspected cases of AI as conversation starters rather than to make accusations.

Again, it is difficult to prove students are using AI and accusing them of doing so is problematic. Even ChatGPT knows that. When I asked it why it is bad to accuse students of using AI to write papers, the chatbot answered: “Accusing students of using AI without proper evidence or understanding can be problematic for several reasons.”

Then it launched into a list.


Erik Ofgang is a Tech & Learning contributor. A journalist, author and educator, his work has appeared in The New York Times, the Washington Post, the Smithsonian, The Atlantic, and Associated Press. He currently teaches at Western Connecticut State University's MFA program. While a staff writer at Connecticut Magazine he won a Society of Professional Journalism Award for his education reporting. He is interested in how humans learn and how technology can make that more effective.



  • Open access
  • Published: 03 June 2024

Applying large language models for automated essay scoring for non-native Japanese

Wenchao Li & Haitao Liu

Humanities and Social Sciences Communications, volume 11, Article number: 723 (2024)


Recent advancements in artificial intelligence (AI) have led to increased use of large language models (LLMs) for language assessment tasks such as automated essay scoring (AES), automated listening tests, and automated oral proficiency assessments. The application of LLMs to AES in the context of non-native Japanese, however, remains limited. This study explores the potential of LLM-based AES by comparing the efficiency of different models: two conventional machine learning-based methods (Jess and JWriter), two LLMs (GPT and BERT), and one Japanese local LLM (the Open-Calm large model). To conduct the evaluation, a dataset consisting of 1400 story-writing scripts authored by learners with 12 different first languages was used. Statistical analysis revealed that GPT-4 outperforms Jess, JWriter, BERT, and the Japanese language-specific trained Open-Calm large model in terms of annotation accuracy and prediction of learning levels. Furthermore, by comparing 18 different models that utilize various prompts, the study emphasizes the significance of prompts in achieving accurate and reliable evaluations with LLMs.


Conventional machine learning technology in AES

AES has experienced significant growth with the advancement of machine learning technologies in recent decades. In the earlier stages of AES development, conventional machine learning-based approaches were commonly used. These approaches involved the following procedures: (a) feeding the machine with a dataset: a dataset of essays is provided to the machine learning system and serves as the basis for training the model and establishing patterns and correlations between linguistic features and human ratings; (b) training the machine learning model on the linguistic features that best represent human ratings and can effectively discriminate learners' writing proficiency. These features include lexical richness (Lu, 2012; Kyle and Crossley, 2015; Kyle et al. 2021), syntactic complexity (Lu, 2010; Liu, 2008), and text cohesion (Crossley and McNamara, 2016), among others. Conventional machine learning approaches to AES require human intervention, such as manual correction and annotation of essays; this human involvement is necessary to create a labeled dataset for training the model. Several AES systems have been developed using conventional machine learning technologies. These include the Intelligent Essay Assessor (Landauer et al. 2003), the e-rater engine by Educational Testing Service (Attali and Burstein, 2006; Burstein, 2003), MyAccess with the IntelliMetric scoring engine by Vantage Learning (Elliot, 2003), and the Bayesian Essay Test Scoring system (Rudner and Liang, 2002). These systems have played a significant role in automating the essay scoring process and providing quick and consistent feedback to learners. However, as touched upon earlier, conventional machine learning approaches rely on predetermined linguistic features and often require manual intervention, making them less flexible and potentially limiting their generalizability to different contexts.

In the context of the Japanese language, conventional machine learning-based AES tools include Jess (Ishioka and Kameda, 2006) and JWriter (Lee and Hasebe, 2017). Jess assesses essays by deducting points from a perfect score, using the Mainichi Daily News newspaper as a database. The evaluation criteria employed by Jess encompass rhetorical elements (e.g., reading comprehension, vocabulary diversity, percentage of complex words, and percentage of passive sentences), organizational structures (e.g., forward and reverse connection structures), and content analysis (e.g., latent semantic indexing). JWriter employs linear regression analysis to assign weights to various measurement indices, such as average sentence length and total number of characters; these weighted indices are then combined to derive the overall score. A pilot study involving the Jess model was conducted on 1320 essays at three proficiency levels: primary, intermediate, and advanced. The results indicated that the Jess model failed to significantly distinguish between these essay levels. Of the 16 measures used, four (median sentence length, median clause length, median number of phrases, and maximum number of phrases) did not show statistically significant differences between the levels. Two further measures (the number of attributive declined words and the kanji/kana ratio) exhibited between-level differences but lacked linear progression. The remaining measures, including maximum sentence length, maximum clause length, number of attributive conjugated words, maximum number of consecutive infinitive forms, maximum number of conjunctive-particle clauses, k characteristic value, percentage of big words, and percentage of passive sentences, demonstrated statistically significant between-level differences and displayed linear progression.
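
JWriter's weighted-index scheme can be sketched as a plain linear model. The two indices below mirror those named above (average sentence length, total number of characters), but the weights and intercept are invented for illustration and are not JWriter's actual coefficients:

```python
def jwriter_style_score(essay, weights, intercept=0.0):
    """Score an essay as a weighted sum of simple surface indices,
    in the spirit of JWriter's linear-regression approach.

    The features are computed from the text; the weights must come
    from a fitted regression (here they are purely hypothetical).
    """
    # Naive sentence split on terminal punctuation
    sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    total_chars = len(essay)
    avg_sentence_len = total_chars / max(len(sentences), 1)
    features = {
        "avg_sentence_length": avg_sentence_len,
        "total_characters": total_chars,
    }
    return intercept + sum(weights[name] * value for name, value in features.items())

weights = {"avg_sentence_length": 0.5, "total_characters": 0.01}  # hypothetical
print(round(jwriter_style_score("This is short. This is another sentence.", weights), 2))  # -> 10.4
```

As the surrounding text notes, learners who know such a regression equation can game it, which is one motivation for moving beyond fixed surface indices.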

Both Jess and JWriter exhibit notable limitations, including the manual selection of feature parameters and weights, which can introduce biases into the scoring process. The reliance on human annotators to label non-native language essays also introduces potential noise and variability in the scoring. Furthermore, an important concern is the possibility of system manipulation and cheating by learners who are aware of the regression equation utilized by the models (Hirao et al. 2020 ). These limitations emphasize the need for further advancements in AES systems to address these challenges.

Deep learning technology in AES

Deep learning has emerged as one of the approaches for improving the accuracy and effectiveness of AES. Deep learning-based AES methods utilize artificial neural networks that mimic the human brain’s functioning through layered algorithms and computational units. Unlike conventional machine learning, deep learning autonomously learns from the environment and past errors without human intervention. This enables deep learning models to establish nonlinear correlations, resulting in higher accuracy. Recent advancements in deep learning have led to the development of transformers, which are particularly effective in learning text representations. Noteworthy examples include bidirectional encoder representations from transformers (BERT) (Devlin et al. 2019 ) and the generative pretrained transformer (GPT) (OpenAI).

BERT is a linguistic representation model that utilizes a transformer architecture and is trained on two tasks: masked language modeling and next-sentence prediction (Hirao et al. 2020; Vaswani et al. 2017). In the context of AES, BERT follows specific procedures, as illustrated in Fig. 1: (a) the tokenized prompts and essays are taken as input; (b) special tokens, such as [CLS] and [SEP], are added to mark the beginning and separation of prompts and essays; (c) the transformer encoder processes the prompt and essay sequences, resulting in hidden layer sequences; (d) the hidden layers corresponding to the [CLS] tokens (T[CLS]) represent distributed representations of the prompts and essays; and (e) a multilayer perceptron uses these distributed representations as input to obtain the final score (Hirao et al. 2020).

Fig. 1: AES system with BERT (Hirao et al. 2020).
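
The five-step procedure can be sketched structurally. The encoder below is a stand-in stub (a real system would call a pretrained BERT encoder); only the data flow (tokenize, add [CLS]/[SEP], encode, pool the [CLS] vector, apply a perceptron) follows the procedure described above, and the weights are arbitrary:

```python
import math

def stub_encoder(tokens, dim=4):
    """Stand-in for a transformer encoder: one hidden vector per token.
    A real AES system would run a pretrained BERT here."""
    return [[math.sin(i + j + len(tok)) for j in range(dim)] for i, tok in enumerate(tokens)]

def score_essay(prompt, essay, mlp_weights, mlp_bias=0.0):
    # (a)-(b) tokenize and add special tokens marking start and separation
    tokens = ["[CLS]"] + prompt.split() + ["[SEP]"] + essay.split() + ["[SEP]"]
    # (c) encode the whole sequence into hidden vectors
    hidden = stub_encoder(tokens)
    # (d) the hidden state at [CLS] summarizes the prompt-essay pair
    t_cls = hidden[0]
    # (e) a (one-layer) perceptron maps the representation to a score
    return sum(w * h for w, h in zip(mlp_weights, t_cls)) + mlp_bias

score = score_essay("Describe your picnic.", "We went to the park.", [0.1, 0.2, 0.3, 0.4])
print(isinstance(score, float))  # -> True
```

In training, the perceptron (and, typically, the encoder) would be fitted to human scores; the stub only makes the wiring of steps (a)-(e) concrete.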

Training BERT on a substantial amount of sentence data through the masked language model (MLM) objective allows it to capture contextual information within the hidden layers. Consequently, BERT is expected to be capable of identifying artificial essays as invalid and assigning them lower scores (Mizumoto and Eguchi, 2023). In the context of AES for non-native Japanese learners, Hirao et al. (2020) combined the long short-term memory (LSTM) model proposed by Hochreiter and Schmidhuber (1997) with BERT to develop a tailored automated essay scoring system. The findings of their study revealed that the BERT model outperformed both the conventional machine learning approach utilizing character-type features such as kanji and hiragana and the standalone LSTM model. Takeuchi et al. (2021) presented an approach to Japanese AES that eliminates the requirement for pre-scored essays by relying solely on reference texts or a model answer for the essay task. They investigated multiple similarity evaluation methods, including frequency of morphemes, idf values calculated on Wikipedia, LSI, LDA, word-embedding vectors, and document vectors produced by BERT. The experimental findings revealed that the method utilizing the frequency of morphemes with idf values exhibited the strongest correlation with human-annotated scores across different essay tasks. The utilization of BERT in AES encounters several limitations. First, essays often exceed the model's maximum length limit. Second, only score labels are available for training, which restricts access to additional information.
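
The reference-text method of Takeuchi et al. (2021) can be illustrated as idf-weighted term overlap between an essay and a model answer. The sketch below is heavily simplified: it tokenizes on whitespace rather than by morphological analysis, and it computes idf over a three-document toy corpus rather than Wikipedia:

```python
import math
from collections import Counter

def idf_weights(documents):
    """idf over a toy background corpus; the study computed idf on Wikipedia."""
    n = len(documents)
    df = Counter(term for doc in documents for term in set(doc.split()))
    return {term: math.log(n / count) for term, count in df.items()}

def similarity_score(essay, reference, idf):
    """Cosine similarity of idf-weighted term-frequency vectors."""
    e, r = Counter(essay.split()), Counter(reference.split())
    terms = set(e) | set(r)
    ev = {t: e[t] * idf.get(t, 0.0) for t in terms}
    rv = {t: r[t] * idf.get(t, 0.0) for t in terms}
    dot = sum(ev[t] * rv[t] for t in terms)
    norm = math.sqrt(sum(v * v for v in ev.values())) * math.sqrt(sum(v * v for v in rv.values()))
    return dot / norm if norm else 0.0

corpus = ["the park was fun", "we went home", "a key was lost"]
idf = idf_weights(corpus)
print(round(similarity_score("we went to the park", "we went to the park", idf), 2))  # -> 1.0
```

An essay identical to the reference scores 1.0; essays sharing fewer informative terms score lower, which is the signal correlated with human ratings in the study.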

Mizumoto and Eguchi (2023) were pioneers in employing the GPT model for AES in non-native English writing. Their study focused on evaluating the accuracy and reliability of AES using the GPT-3 text-davinci-003 model, analyzing a dataset of 12,100 essays from the corpus of non-native written English (TOEFL11). The findings indicated that AES utilizing the GPT-3 model exhibited a certain degree of accuracy and reliability, and the authors suggest that GPT-3-based AES systems hold the potential to support human ratings. However, applying the GPT model to AES presents a unique natural language processing (NLP) task that involves considerations such as non-native language proficiency, the influence of the learner's first language on the output in the target language, and identifying the linguistic features that best indicate writing quality in a specific language. These linguistic features may differ morphologically or syntactically from those present in the learners' first language, as observed in (1)–(3).

(1) Isolating (Mandarin Chinese)

我送了他一本书
Wǒ sòngle tā yī běn shū
1SG-give.PAST-3SG-one-CL-book
"I gave him a book."

(2) Agglutinative (Japanese)

彼に本をあげました
Kare-ni hon-o age-mashi-ta
3SG-DAT book-ACC give.HON-PAST
"I gave him a book."

(3) Inflectional (English)

give, give-s, gave, given, giving

Additionally, the morphological agglutination and subject-object-verb (SOV) order in Japanese, along with its idiomatic expressions, pose additional challenges for applying language models in AES tasks (4).

(4) 足が棒になりました
Ashi-ga bō-ni nari-mashita
leg-NOM stick-DAT become-PAST
"My leg became like a stick (I am extremely tired)."

The example sentence demonstrates the morpho-syntactic structure of Japanese and the presence of an idiomatic expression. In this sentence, the verb "なる" (naru), meaning "to become", appears at the end of the sentence. The verb stem "なり" (nari) is attached with morphemes indicating politeness ("まし" mashi, a form of "ます" masu) and tense ("た" ta), showcasing agglutination. While the sentence can be literally translated as "my leg became like a stick", it carries an idiomatic interpretation that implies "I am extremely tired".

To address these challenges, CyberAgent Inc. (2023) developed the Open-Calm series of language models specifically designed for Japanese. Open-Calm consists of pre-trained models available in various sizes, such as Small, Medium, Large, and 7B. Figure 2 depicts the fundamental structure of the Open-Calm model. A key feature of this architecture is the incorporation of the LoRA adapter and GPT-NeoX frameworks, which enhance its language processing capabilities.

Fig. 2: GPT-NeoX model architecture (Okgetheng and Takeuchi, 2024).

In a recent study conducted by Okgetheng and Takeuchi ( 2024 ), they assessed the efficacy of Open-Calm language models in grading Japanese essays. The research utilized a dataset of approximately 300 essays, which were annotated by native Japanese educators. The findings of the study demonstrate the considerable potential of Open-Calm language models in automated Japanese essay scoring. Specifically, among the Open-Calm family, the Open-Calm Large model (referred to as OCLL) exhibited the highest performance. However, it is important to note that, as of the current date, the Open-Calm Large model does not offer public access to its server. Consequently, users are required to independently deploy and operate the environment for OCLL. In order to utilize OCLL, users must have a PC equipped with an NVIDIA GeForce RTX 3060 (8 or 12 GB VRAM).

In summary, while the potential of LLMs in automated scoring of nonnative Japanese essays has been demonstrated in two studies—BERT-driven AES (Hirao et al. 2020 ) and OCLL-based AES (Okgetheng and Takeuchi, 2024 )—the number of research efforts in this area remains limited.

Another significant challenge in applying LLMs to AES lies in prompt engineering and ensuring its reliability and effectiveness (Brown et al. 2020; Rae et al. 2021; Zhang et al. 2021). Various prompting strategies have been proposed, such as the zero-shot chain-of-thought (CoT) approach (Kojima et al. 2022) and approaches that involve manually crafting diverse and effective examples; manual effort, however, can introduce mistakes. To address this, Zhang et al. (2021) introduced an automatic CoT prompting method called Auto-CoT, which matches or exceeds the performance of the CoT paradigm. Another prompt framework is tree of thoughts, which enables a model to self-evaluate its progress at intermediate stages of problem-solving through deliberate reasoning (Yao et al. 2023).
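
Mechanically, the difference between a zero-shot prompt and a zero-shot-CoT prompt is just an appended reasoning cue, and the model's reply must then be parsed into a level. The prompt wording, level scale formatting, and reply format below are schematic, not the prompts evaluated in this study, and the API call itself is omitted:

```python
import re

LEVELS = ("A1", "A2", "B1", "B2", "C1", "C2")

def build_prompt(essay, chain_of_thought=False):
    """Build a hypothetical CEFR-scoring prompt for an LLM rater."""
    prompt = (
        "You are a rater of non-native Japanese writing.\n"
        f"Assign one CEFR level from {', '.join(LEVELS)} "
        "and reply in the form 'Level: <level>'.\n"
    )
    if chain_of_thought:
        # zero-shot CoT cue (Kojima et al. 2022): elicit reasoning first
        prompt += "Let's think step by step before giving the level.\n"
    return prompt + f"\nEssay:\n{essay}"

def parse_level(reply):
    """Extract the level from a reply such as '... Level: B1'."""
    m = re.search(r"Level:\s*([ABC][12])", reply)
    return m.group(1) if m else None

print(parse_level("The essay shows control of basic patterns. Level: B1"))  # -> B1
```

Comparing many such prompt variants against human ratings is, in essence, what the study's 18-model comparison does at scale.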

Beyond linguistic studies, there has been a noticeable increase in the number of foreign workers in Japan and of Japanese learners worldwide (Ministry of Health, Labor, and Welfare of Japan, 2022; Japan Foundation, 2021). However, existing assessment methods, such as the Japanese Language Proficiency Test (JLPT), J-CAT, and TTBJ, primarily focus on reading, listening, vocabulary, and grammar skills, neglecting the evaluation of writing proficiency. As the number of workers and language learners continues to grow, there is a rising demand for an efficient AES system that can reduce raters' costs and time and that can be used for employment, examinations, and self-study.

This study aims to explore the potential of LLM-based AES by comparing the effectiveness of five models: two LLMs (GPT and BERT), one Japanese local LLM (OCLL), and two conventional machine learning-based methods (the linguistic feature-based scoring tools Jess and JWriter).

The research questions addressed in this study are as follows:

To what extent do the LLM-driven AES and linguistic feature-based AES, when used as automated tools to support human rating, accurately reflect test takers’ actual performance?

What influence does the prompt have on the accuracy and performance of LLM-based AES methods?

The subsequent sections of the manuscript cover the methodology, including the assessment measures for nonnative Japanese writing proficiency, criteria for prompts, and the dataset. The evaluation section focuses on the analysis of annotations and rating scores generated by LLM-driven and linguistic feature-based AES methods.

Methodology

The dataset utilized in this study was obtained from the International Corpus of Japanese as a Second Language (I-JAS) (see footnote 3). This corpus consists of contributions from 1000 participants representing 12 different first languages. For the study, the participants were given a story-writing task on a personal computer: they were required to write two stories based on the four-panel illustrations titled “Picnic” and “The key” (see Appendix A). The corpus provides background information for the participants, including their Japanese language proficiency levels assessed through two online tests, J-CAT and SPOT, which evaluate reading, listening, vocabulary, and grammar abilities. The learners’ proficiency levels were categorized into six levels aligned with the Common European Framework of Reference for Languages (CEFR) and the Reference Framework for Japanese Language Education (RFJLE): A1, A2, B1, B2, C1, and C2. According to Lee et al. (2015), there is a high level of agreement (r = 0.86) between the J-CAT and SPOT assessments, indicating that the proficiency certifications provided by J-CAT are consistent with those of SPOT. However, it is important to note that the scores of J-CAT and SPOT do not have a one-to-one correspondence. In this study, the J-CAT scores were used as a benchmark to differentiate learners of different proficiency levels. A total of 1400 essays were utilized, representing the A1 (beginner), A2, B1, B2, C1, and C2 levels based on the J-CAT scores. Table 1 provides information about the learners’ proficiency levels and their corresponding J-CAT and SPOT scores.

A dataset comprising a total of 1400 essays from the story-writing tasks was collected. Among these, 714 essays were utilized to evaluate the reliability of the LLM-based AES method, while the remaining 686 essays were designated as development data to assess the LLM-based AES’s capability to distinguish participants with varying proficiency levels. The GPT-4 API was used in this study. A detailed explanation of the prompt-assessment criteria is provided in Section Prompt. All essays were sent to the model for measurement and scoring.

Measures of writing proficiency for nonnative Japanese

Japanese exhibits a morphologically agglutinative structure in which morphemes are attached to the word stem to convey grammatical functions such as tense, aspect, voice, and honorifics, as in (5):

食べ-させ-られ-まし-た-か

tabe-sase-rare-mashi-ta-ka

[eat (stem)-causative-passive-honorific-past tense-question marker]

Japanese employs nine case particles to indicate grammatical functions, including the nominative case particle が (ga), the accusative case particle を (o), the genitive case particle の (no), the dative case particle に (ni), the locative/instrumental case particle で (de), the ablative case particle から (kara), the directional case particle へ (e), and the comitative case particle と (to). The agglutinative nature of the language, combined with the case particle system, provides an efficient means of distinguishing between active and passive voice, either through morphemes or case particles, e.g. 食べる taberu “eat (conclusive)” (active voice) vs. 食べられる taberareru “eat (conclusive)” (passive voice). In the active voice, “パンを食べる” (pan o taberu) translates to “to eat bread”. In the passive voice, it becomes “パンが食べられた” (pan ga taberareta), which means “(the) bread was eaten”. Additionally, it is important to note that different conjugations of the same lemma are counted as one type in order to ensure a comprehensive assessment of the language features; for example, 食べる taberu “eat (conclusive)”, 食べている tabeteiru “eat (progressive)”, and 食べた tabeta “eat (past)” are treated as one type.

To incorporate these features, previous research (Suzuki, 1999 ; Watanabe et al. 1988 ; Ishioka, 2001 ; Ishioka and Kameda, 2006 ; Hirao et al. 2020 ) has identified complexity, fluency, and accuracy as crucial factors for evaluating writing quality. These criteria are assessed through various aspects, including lexical richness (lexical density, diversity, and sophistication), syntactic complexity, and cohesion (Kyle et al. 2021 ; Mizumoto and Eguchi, 2023 ; Ure, 1971 ; Halliday, 1985 ; Barkaoui and Hadidi, 2020 ; Zenker and Kyle, 2021 ; Kim et al. 2018 ; Lu, 2017 ; Ortega, 2015 ). Therefore, this study proposes five scoring categories: lexical richness, syntactic complexity, cohesion, content elaboration, and grammatical accuracy. A total of 16 measures were employed to capture these categories. The calculation process and specific details of these measures can be found in Table 2 .

T-unit, first introduced by Hunt ( 1966 ), is a measure used for evaluating speech and composition. It serves as an indicator of syntactic development and represents the shortest units into which a piece of discourse can be divided without leaving any sentence fragments. In the context of Japanese language assessment, Sakoda and Hosoi ( 2020 ) utilized T-unit as the basic unit to assess the accuracy and complexity of Japanese learners’ speaking and storytelling. The calculation of T-units in Japanese follows the following principles:

A single main clause constitutes 1 T-unit, regardless of the presence or absence of dependent clauses, e.g. (6).

ケンとマリはピクニックに行きました (main clause): 1 T-unit.

If a sentence contains a main clause along with subclauses, each subclause is considered part of the same T-unit, e.g. (7).

天気が良かった の で (subclause)、ケンとマリはピクニックに行きました (main clause): 1 T-unit.

In the case of coordinate clauses, where multiple clauses are connected, each coordinated clause is counted separately. Thus, a sentence with coordinate clauses may have 2 T-units or more, e.g. (8).

ケンは地図で場所を探して (coordinate clause)、マリはサンドイッチを作りました (coordinate clause): 2 T-units.
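As a minimal illustration, the three counting rules above can be sketched in Python, assuming clauses have already been segmented and labelled (a real implementation would require a syntactic parser; the labels 'main', 'sub', and 'coord' are hypothetical annotations introduced here, not part of the original study):

```python
def count_t_units(clause_labels):
    """Count T-units from a pre-annotated clause sequence.

    Each clause is labelled 'main', 'sub', or 'coord'. A main clause
    opens one T-unit (rule 6), subordinate clauses attach to the
    T-unit of their main clause (rule 7), and each coordinate clause
    counts as its own T-unit (rule 8).
    """
    return sum(1 for label in clause_labels if label in ("main", "coord"))
```

For example (7) above, the sequence `["sub", "main"]` yields 1 T-unit, while the two coordinate clauses in example (8), `["coord", "coord"]`, yield 2.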

Lexical diversity refers to the range of words used within a text (Engber, 1995; Kyle et al. 2021) and is considered a useful measure of the breadth of vocabulary in Ln production (Jarvis, 2013a, 2013b).

The type/token ratio (TTR) is widely recognized as a straightforward measure for calculating lexical diversity and has been employed in numerous studies. These studies have demonstrated a strong correlation between TTR and other methods of measuring lexical diversity (e.g., Bentz et al. 2016 ; Čech and Miroslav, 2018 ; Çöltekin and Taraka, 2018 ). TTR is computed by considering both the number of unique words (types) and the total number of words (tokens) in a given text. Given that the length of learners’ writing texts can vary, this study employs the moving average type-token ratio (MATTR) to mitigate the influence of text length. MATTR is calculated using a 50-word moving window. Initially, a TTR is determined for words 1–50 in an essay, followed by words 2–51, 3–52, and so on until the end of the essay is reached (Díez-Ortega and Kyle, 2023 ). The final MATTR scores were obtained by averaging the TTR scores for all 50-word windows. The following formula was employed to derive MATTR:

\({\rm{MATTR}}(W)=\frac{{\sum }_{i=1}^{N-W+1}{F}_{i}}{W(N-W+1)}\)

Here, N refers to the number of tokens in the text, W is the chosen window size (W < N), and \({F}_{i}\) is the number of types in each window. \({\rm{MATTR}}(W)\) is the mean of the series of type-token ratios (TTRs), based on word forms, over all windows. It is expected that individuals with higher language proficiency will produce texts with greater lexical diversity, as indicated by higher MATTR scores.
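The windowing procedure can be sketched in Python (a minimal illustration assuming the essay has already been tokenized; the function name and the fallback to plain TTR for short texts are our assumptions):

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over fixed-size windows.

    Slides a window of `window` tokens through the text, computes the
    TTR (types / tokens) in each window, and averages the results.
    Falls back to plain TTR when the text is shorter than the window.
    """
    n = len(tokens)
    if n <= window:
        return len(set(tokens)) / n
    ttrs = [len(set(tokens[i:i + window])) / window
            for i in range(n - window + 1)]
    return sum(ttrs) / len(ttrs)
```

For `["a", "a", "b", "b"]` with a window of 2, the three windows give TTRs 0.5, 1.0, and 0.5, so the MATTR is 2/3.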

Lexical density was captured by the ratio of the number of lexical words to the total number of words (Lu, 2012). Lexical sophistication refers to the utilization of advanced vocabulary, often evaluated through word frequency indices (Crossley et al. 2013; Haberman, 2008; Kyle and Crossley, 2015; Laufer and Nation, 1995; Lu, 2012; Read, 2000). In the context of writing, lexical sophistication can be interpreted as vocabulary breadth, which entails the appropriate usage of vocabulary items across various lexico-grammatical contexts and registers (Garner et al. 2019; Kim et al. 2018; Kyle et al. 2018). In Japanese specifically, words are considered lexically sophisticated if they are not included in the “Japanese Education Vocabulary List Ver 1.0” (see footnote 4). Consequently, lexical sophistication was calculated by determining the number of sophisticated word types relative to the total number of words per essay. Furthermore, it has been suggested that, in Japanese writing, sentences should ideally be no longer than 40 to 50 characters, as this promotes readability. Therefore, the median and maximum sentence length can be considered useful indices for assessment (Ishioka and Kameda, 2006).

Syntactic complexity was assessed based on several measures, including the mean length of clauses, verb phrases per T-unit, clauses per T-unit, dependent clauses per T-unit, complex nominals per clause, adverbial clauses per clause, coordinate phrases per clause, and mean dependency distance (MDD). The MDD reflects the distance between the governor and dependent positions in a sentence. A larger dependency distance indicates a higher cognitive load and greater complexity in syntactic processing (Liu, 2008; Liu et al. 2017). The MDD has been established as an efficient metric for measuring syntactic complexity (Jiang, Quyang, and Liu, 2019; Li and Yan, 2021). To calculate the MDD, the position numbers of the governor and dependent are subtracted, assuming that words in a sentence are assigned positions in linear order, such as W1 … Wi … Wn. In any dependency relationship between words Wa and Wb, Wa is the governor and Wb is the dependent. The MDD of the entire sentence was obtained by averaging the absolute values of the governor–dependent position differences:

MDD = \(\frac{1}{n}{\sum }_{i=1}^{n}|{\rm{D}}{{\rm{D}}}_{i}|\)

In this formula, \(n\) represents the number of words in the sentence, and \({DD}_{i}\) is the dependency distance of the \(i\)-th dependency relationship of a sentence. For example, for the sentence Mary-ga John-ni keshigomu-o watashita [Mary-TOP John-DAT eraser-ACC give-PAST] “Mary gave John an eraser”, the MDD would be 2. Table 3 provides the CSV file used as a prompt for GPT-4.
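Given governor/dependent position pairs from a dependency parse, the MDD computation is a short sketch (positions are 1-based; the dependency pairs used in the example below are an assumed annotation for the sentence in the text, not taken from the study's data):

```python
def mean_dependency_distance(dependencies):
    """Mean dependency distance: the average absolute difference
    between the linear positions of each governor and its dependent.

    `dependencies` is a list of (governor_position, dependent_position)
    pairs, with 1-based word positions in the sentence.
    """
    distances = [abs(gov - dep) for gov, dep in dependencies]
    return sum(distances) / len(distances)
```

If the verb watashita (position 4) governs the three arguments at positions 1, 2, and 3, the distances are 3, 2, and 1, averaging to the MDD of 2 given in the text.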

Cohesion (semantic similarity) and content elaboration aim to capture the ideas presented in test takers’ essays. Cohesion was assessed using three measures: synonym overlap/paragraph (topic), synonym overlap/paragraph (keywords), and word2vec cosine similarity. Content elaboration and development were measured as the number of metadiscourse markers (types) divided by the number of words. To capture content closely, this study proposed a novel distance-based representation, encoding the cosine distance between the learner’s essay and the essay task’s (topic and keyword) i-vectors. The learner’s essay is decoded into a word sequence and aligned to the essay task’s topic and keywords for log-likelihood measurement. The cosine distance yields the content elaboration score for the learner’s essay. The mathematical equation of cosine similarity between target and reference vectors is shown in (11), assuming there are i essays, where \((L_{i},\ldots,L_{n})\) and \((N_{i},\ldots,N_{n})\) are the vectors representing the learner’s essay and the task’s topic and keywords, respectively. The content elaboration distance between \(L_{i}\) and \(N_{i}\) was calculated as follows:

\(\cos \left(\theta \right)=\frac{{\rm{L}}\,\cdot\, {\rm{N}}}{\left|{\rm{L}}\right|{\rm{|N|}}}=\frac{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}{N}_{i}}{\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}^{2}}\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{N}_{i}^{2}}}\)

A high similarity value indicates a low difference between the two recognition outcomes, which in turn suggests a high level of proficiency in content elaboration.
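Equation (11) can be computed directly; a minimal pure-Python sketch follows (constructing the learner and task vectors from word2vec or i-vectors is assumed to have happened upstream):

```python
import math

def cosine_similarity(l_vec, n_vec):
    """Cosine similarity between a learner-essay vector L and a task
    (topic/keyword) vector N: dot(L, N) / (|L| * |N|)."""
    dot = sum(l * n for l, n in zip(l_vec, n_vec))
    norm_l = math.sqrt(sum(l * l for l in l_vec))
    norm_n = math.sqrt(sum(n * n for n in n_vec))
    return dot / (norm_l * norm_n)
```

Identical directions give 1.0 and orthogonal vectors give 0.0, matching the interpretation that higher similarity indicates stronger content elaboration.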

To evaluate the effectiveness of the proposed measures in distinguishing different proficiency levels among nonnative Japanese speakers’ writing, we conducted a multi-faceted Rasch measurement analysis (Linacre, 1994 ). This approach applies measurement models to thoroughly analyze various factors that can influence test outcomes, including test takers’ proficiency, item difficulty, and rater severity, among others. The underlying principles and functionality of multi-faceted Rasch measurement are illustrated in (12).

\(\log \left(\frac{{P}_{{nijk}}}{{P}_{{nij}(k-1)}}\right)={B}_{n}-{D}_{i}-{C}_{j}-{F}_{k}\)

Equation (12) defines the logarithmic transformation of the probability ratio \({P}_{nijk}/{P}_{nij(k-1)}\) as a function of multiple parameters. Here, n represents the test taker, i denotes a writing proficiency measure, j corresponds to the human rater, and k represents the proficiency score. The parameter \({B}_{n}\) signifies the proficiency level of test taker n (where n ranges from 1 to N). \({D}_{i}\) represents the difficulty parameter of test item i (where i ranges from 1 to L), while \({C}_{j}\) represents the severity of rater j (where j ranges from 1 to J). Additionally, \({F}_{k}\) represents the step difficulty for a test taker to move from score k−1 to k. \({P}_{nijk}\) refers to the probability of rater j assigning score k to test taker n for test item i, and \({P}_{nij(k-1)}\) to the probability of score k−1 being assigned. Each facet within the test is treated as an independent parameter and estimated within the same reference framework. To evaluate the consistency of scores obtained through both human and computer analysis, we utilized the Infit mean-square statistic. This statistic is a chi-square measure divided by the degrees of freedom, weighted with information. It demonstrates higher sensitivity to unexpected patterns in responses to items near a person’s proficiency level (Linacre, 2002). Fit statistics are assessed against predefined thresholds for acceptable fit. For the Infit MNSQ, which has a mean of 1.00, different thresholds have been suggested: some propose stricter thresholds ranging from 0.7 to 1.3 (Bond et al. 2021), while others suggest more lenient thresholds ranging from 0.5 to 1.5 (Eckes, 2009). In this study, we adopted the criterion of 0.70–1.30 for the Infit MNSQ.
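To make equation (12) concrete, the sketch below derives the category probabilities implied by the adjacent-category logits for a single test taker, item, and rater. This is an illustration only (parameter values would come from estimation software such as FACETS, not from this code):

```python
import math

def rasch_category_probs(b_n, d_i, c_j, steps):
    """Category probabilities for a many-facet rating scale model.

    `steps` holds the step difficulties F_k for k = 1..K (score 0 has
    no step). Per equation (12), log(P_k / P_{k-1}) = B_n - D_i - C_j
    - F_k, so the cumulative logit for score k is the sum of the
    adjacent-category logits up to k. Returns P(score = k), k = 0..K.
    """
    logits = [0.0]
    total = 0.0
    for f_k in steps:
        total += b_n - d_i - c_j - f_k
        logits.append(total)
    exps = [math.exp(x) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]
```

With all parameters at zero and a single step, the two categories are equally likely; raising the test taker's proficiency B_n shifts probability mass toward the higher score, as the model intends.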

Moving forward, we can now assess the effectiveness of the 16 proposed measures, organized under five criteria, for accurately distinguishing various levels of writing proficiency among nonnative Japanese speakers. To conduct this evaluation, we utilized the development dataset from the I-JAS corpus, as described in Section Dataset. Table 4 provides a measurement report presenting the performance details of the 16 measures under consideration. The measure separation was found to be 4.02, indicating a clear differentiation among the measures. The reliability index for the measure separation was 0.891, suggesting consistency in the measurement. Similarly, the person separation reliability index was 0.802, indicating the accuracy of the assessment in distinguishing between individuals. All 16 measures demonstrated Infit mean squares within a reasonable range, from 0.76 to 1.28. The synonym overlap/paragraph (topic) measure exhibited a relatively high Outfit mean square of 1.46, although its Infit mean square falls within the acceptable range. The standard error for the measures ranged from 0.13 to 0.28, indicating the precision of the estimates.

Table 5 further illustrates the weights assigned to different linguistic measures for score prediction, with higher weights indicating stronger correlations between those measures and higher scores. The following measures exhibited higher weights than the others: moving average type-token ratio per essay (0.0391); mean dependency distance (0.0388); mean length of clause, calculated by dividing the number of words by the number of clauses (0.0374); complex nominals per T-unit, calculated by dividing the number of complex nominals by the number of T-units (0.0379); coordinate phrase rate, calculated by dividing the number of coordinate phrases by the number of clauses (0.0325); and grammatical error rate, representing the number of errors per essay (0.0322).

Criteria (output indicator)

The criteria used to evaluate the writing ability in this study were based on CEFR, which follows a six-point scale ranging from A1 to C2. To assess the quality of Japanese writing, the scoring criteria from Table 6 were utilized. These criteria were derived from the IELTS writing standards and served as assessment guidelines and prompts for the written output.

A prompt is a question or detailed instruction provided to the model to obtain a proper response. After several pilot experiments, we decided to provide the measures (Section Measures of writing proficiency for nonnative Japanese) as the input prompt and use the criteria (Section Criteria (output indicator)) as the output indicator. Regarding the prompt language, given that the LLM was tasked with rating Japanese essays, would prompting in Japanese work better (see footnote 5)? We conducted experiments comparing the performance of GPT-4 using both English and Japanese prompts. Additionally, we utilized the Japanese local model OCLL with Japanese prompts. Multiple trials were conducted using the same sample. Regardless of the prompt language used, we consistently obtained the same grading results with GPT-4, which assigned a grade of B1 to the writing sample. This suggested that GPT-4 is reliable and capable of producing consistent ratings regardless of the prompt language. On the other hand, when we used Japanese prompts with the Japanese local model OCLL, we encountered inconsistent grading results: out of 10 attempts with OCLL, only 6 yielded consistent grading results (B1), while the remaining 4 showed different outcomes, including A1 and B2 grades. These findings indicated that the language of the prompt was not the determining factor for reliable AES. Instead, the size of the training data and the model parameters played crucial roles in achieving consistent and reliable AES results for the language model.

The following is the utilized prompt, which details all measures and requires the LLM to score the essays using holistic and trait scores.

Please evaluate Japanese essays written by Japanese learners and assign a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, C1 to C2. Additionally, please provide trait scores and display the calculation process for each trait score. The scoring should be based on the following criteria:

Moving average type-token ratio.

Number of lexical words (token) divided by the total number of words per essay.

Number of sophisticated word types divided by the total number of words per essay.

Mean length of clause.

Verb phrases per T-unit.

Clauses per T-unit.

Dependent clauses per T-unit.

Complex nominals per clause.

Adverbial clauses per clause.

Coordinate phrases per clause.

Mean dependency distance.

Synonym overlap paragraph (topic and keywords).

Word2vec cosine similarity.

Connectives per essay.

Conjunctions per essay.

Number of metadiscourse markers (types) divided by the total number of words.

Number of errors per essay.

Japanese essay text

出かける前に二人が地図を見ている間に、サンドイッチを入れたバスケットに犬が入ってしまいました。それに気づかずに二人は楽しそうに出かけて行きました。やがて突然犬がバスケットから飛び出し、二人は驚きました。バスケット の 中を見ると、食べ物はすべて犬に食べられていて、二人は困ってしまいました。(ID_JJJ01_SW1)

The score of the example above was B1. Figure 3 provides an example of the holistic and trait scores provided by GPT-4 (with a prompt indicating all measures) via Bing (see footnote 6).

Figure 3: Example of GPT-4 AES and feedback (with a prompt indicating all measures).

Statistical analysis

The aim of this study is to investigate the potential use of LLM for nonnative Japanese AES. It seeks to compare the scoring outcomes obtained from feature-based AES tools, which rely on conventional machine learning technology (i.e. Jess, JWriter), with those generated by AI-driven AES tools utilizing deep learning technology (BERT, GPT, OCLL). To assess the reliability of a computer-assisted annotation tool, the study initially established human-human agreement as the benchmark measure. Subsequently, the performance of the LLM-based method was evaluated by comparing it to human-human agreement.

To assess annotation agreement, the study employed the standard measures of precision, recall, and F-score (Brants 2000; Lu 2010), along with the quadratically weighted kappa (QWK), to evaluate the consistency and agreement in the annotation process. Assume A and B represent human annotators; when comparing their annotations, the following results are obtained. Precision, recall, and F-score are defined in equations (13) to (15).

\({\rm{Recall}}(A,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,A}\)

\({\rm{Precision}}(A,\,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,B}\)

The F-score is the harmonic mean of recall and precision:

\({\rm{F}}-{\rm{score}}=\frac{2* ({\rm{Precision}}* {\rm{Recall}})}{{\rm{Precision}}+{\rm{Recall}}}\)

The highest possible value of an F-score is 1.0, indicating perfect precision and recall, and the lowest possible value is 0, which occurs if either precision or recall is zero.
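Equations (13)-(15) can be sketched as follows, treating each annotator's output as a set of identifiable nodes (a simplification of real annotation comparison, where duplicates and partial matches may need handling):

```python
def agreement_scores(nodes_a, nodes_b):
    """Precision, recall, and F-score between two annotators' node
    sets, per equations (13)-(15): recall is taken relative to
    annotator A's nodes, precision relative to annotator B's."""
    identical = len(nodes_a & nodes_b)
    recall = identical / len(nodes_a)
    precision = identical / len(nodes_b)
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    return precision, recall, f_score
```

For example, if A marks nodes {1, 2, 3, 4} and B marks {2, 3, 4, 5}, three nodes are identical, so precision, recall, and F-score are all 0.75.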

In accordance with Taghipour and Ng ( 2016 ), the calculation of QWK involves two steps:

Step 1: Construct a weight matrix W as follows:

\({W}_{{ij}}=\frac{{(i-j)}^{2}}{{(N-1)}^{2}}\)

Here, i represents the annotation made by the tool, while j represents the annotation made by a human rater, and N denotes the total number of possible annotations. Matrix O is subsequently computed, where \({O}_{i,j}\) represents the count of data annotated as i by the tool and as j by the human annotator. E refers to the expected count matrix, which is normalized so that the sum of its elements matches the sum of the elements in O.

Step 2: With matrices O and E, the QWK is obtained as follows:

\(K=1-\frac{{\sum }_{i,j}{W}_{i,j}{O}_{i,j}}{{\sum }_{i,j}{W}_{i,j}{E}_{i,j}}\)
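Both steps of the QWK computation can be sketched in Python. This assumes labels are integer-coded 0..N−1 and takes the expected matrix E as the outer product of the two marginal distributions, normalized to the same total as O, which is a common reading of Taghipour and Ng (2016):

```python
def quadratic_weighted_kappa(tool, human, num_labels):
    """QWK in two steps: build the weight matrix W, then compare the
    observed matrix O against the expected matrix E."""
    n = num_labels
    # Step 1: weight matrix W_ij = (i - j)^2 / (N - 1)^2.
    w = [[(i - j) ** 2 / (n - 1) ** 2 for j in range(n)] for i in range(n)]
    # Observed counts: O[i][j] = items rated i by the tool, j by the human.
    o = [[0] * n for _ in range(n)]
    for t, h in zip(tool, human):
        o[t][h] += 1
    total = len(tool)
    # Expected counts from marginals, normalized to the same total as O.
    row = [sum(o[i]) for i in range(n)]
    col = [sum(o[i][j] for i in range(n)) for j in range(n)]
    e = [[row[i] * col[j] / total for j in range(n)] for i in range(n)]
    # Step 2: K = 1 - sum(W*O) / sum(W*E).
    num = sum(w[i][j] * o[i][j] for i in range(n) for j in range(n))
    den = sum(w[i][j] * e[i][j] for i in range(n) for j in range(n))
    return 1 - num / den
```

Perfect agreement puts all counts on the diagonal, where the weights are zero, so K = 1; systematic disagreement drives K toward −1.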

The value of the quadratic weighted kappa increases as the level of agreement improves. Further, to assess the accuracy of LLM scoring, the proportional reduction in mean squared error (PRMSE) was employed. The PRMSE approach takes into account the variability observed in human ratings to estimate the rater error, which is then subtracted from the variance of the human labels. This calculation provides an overall measure of agreement between the automated scores and true scores (Haberman et al. 2015; Loukina et al. 2020; Taghipour and Ng, 2016). The computation of PRMSE involves the following steps:

Step 1: Calculate the mean squared errors (MSEs) for the scoring outcomes of the computer-assisted tool (MSE tool) and the human scoring outcomes (MSE human).

Step 2: Determine the PRMSE by comparing the MSE of the computer-assisted tool (MSE tool) with the MSE from human raters (MSE human), using the following formula:

\({\rm{PRMSE}}=1-\frac{{\rm{MSE}}_{{\rm{tool}}}}{{\rm{MSE}}_{{\rm{human}}}}=1-\frac{{\sum }_{i=1}^{n}{({y}_{i}-{\hat{y}}_{i})}^{2}}{{\sum }_{i=1}^{n}{({y}_{i}-\bar{y})}^{2}}\)

In the numerator, \({\hat{y}}_{i}\) represents the score predicted by a specific LLM-driven AES system for a given sample, so \({y}_{i}-{\hat{y}}_{i}\) is the difference between the human-assigned score and the system’s prediction, quantifying the system’s prediction error on that sample. In the denominator, \({y}_{i}-\bar{y}\) represents the difference between the scoring outcome provided by the human raters for a given sample and the mean of all human raters’ scoring outcomes; it measures the spread of the human scores around their average. The PRMSE is then calculated by subtracting the ratio of the MSE of the tool to the MSE of the human raters from 1. PRMSE falls within the range of 0 to 1, with larger values indicating reduced errors in the LLM’s scoring compared to those of human raters. In other words, a higher PRMSE implies that the LLM’s scoring demonstrates greater accuracy in predicting the true scores (Loukina et al. 2020). The interpretation of kappa values is based on the work of Landis and Koch (1977). Specifically, the following categories are assigned to different ranges of kappa values: −1 indicates complete inconsistency, 0 indicates random agreement, 0.00–0.20 indicates a slight level of agreement, 0.21–0.40 a fair level, 0.41–0.60 a moderate level, 0.61–0.80 a substantial level, and 0.81–1.00 an almost perfect level of agreement. All statistical analyses were executed using Python scripts.
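The two steps above can be sketched as a simplified PRMSE that ignores the rater-error adjustment (i.e., it uses the raw variance of a single set of human scores as MSE human; the full procedure of Haberman et al. (2015) would subtract estimated rater error):

```python
def prmse(human_scores, system_scores):
    """Simplified proportional reduction in mean squared error:
    1 minus the ratio of the system's MSE against the human scores
    to the variance of the human scores around their mean."""
    n = len(human_scores)
    mean_h = sum(human_scores) / n
    # Step 1: MSE of the tool against human scores, and human MSE.
    mse_tool = sum((y - s) ** 2
                   for y, s in zip(human_scores, system_scores)) / n
    mse_human = sum((y - mean_h) ** 2 for y in human_scores) / n
    # Step 2: PRMSE = 1 - MSE_tool / MSE_human.
    return 1 - mse_tool / mse_human
```

A system that reproduces the human scores exactly gets PRMSE 1; one that only ever predicts the mean human score gets 0.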

Results and discussion

Annotation reliability of the LLM

This section focuses on assessing the reliability of the LLM’s annotation and scoring capabilities. To evaluate the reliability, several tests were conducted simultaneously, aiming to achieve the following objectives:

Assess the LLM’s ability to differentiate between test takers with varying levels of writing proficiency.

Determine the level of agreement between the annotations and scoring performed by the LLM and those done by human raters.

The evaluation of the results encompassed several metrics, including precision, recall, F-score, quadratically weighted kappa, proportional reduction in mean squared error, Pearson correlation, and multi-faceted Rasch measurement.

Inter-annotator agreement (human–human annotator agreement)

We started with an agreement test between the two human annotators. Two trained annotators were recruited to determine the writing task data measures. A total of 714 scripts was utilized as the test data. Each analysis lasted 300–360 min. Inter-annotator agreement was evaluated using the standard measures of precision, recall, F-score, and QWK. Table 7 presents the inter-annotator agreement for the various indicators. As shown, the inter-annotator agreement was fairly high, with F-scores ranging from 1.0 for sentence and word number to 0.666 for grammatical errors.

The findings from the QWK analysis provided further confirmation of the inter-annotator agreement. The QWK values covered a range from 0.950 ( p  = 0.000) for sentence and word number to 0.695 for synonym overlap number (keyword) and grammatical errors ( p  = 0.001).

Agreement of annotation outcomes between human and LLM

To evaluate the consistency between human annotators and LLM annotators (BERT, GPT, OCLL) across the indices, the same test was conducted. The results of the inter-annotator agreement (F-score) between LLM and human annotation are provided in Appendices B–D. The F-scores ranged from 0.706 for grammatical error counts (OCLL-human) to a perfect 1.000 (GPT-human) for sentences, clauses, T-units, and words. These findings were further supported by the QWK analysis, which showed agreement levels ranging from 0.807 (p = 0.001) for metadiscourse markers (OCLL-human) to 0.962 (p = 0.000) for words (GPT-human). The findings demonstrated that the LLM annotation achieved a significant level of accuracy in identifying measurement units and counts.

Reliability of LLM-driven AES’s scoring and discriminating proficiency levels

This section examines the reliability of the LLM-driven AES scoring through a comparison of the scoring outcomes produced by human raters and the LLM ( Reliability of LLM-driven AES scoring ). It also assesses the effectiveness of the LLM-based AES system in differentiating participants with varying proficiency levels ( Reliability of LLM-driven AES discriminating proficiency levels ).

Reliability of LLM-driven AES scoring

Table 8 summarizes the QWK coefficient analysis between the scores computed by the human raters and GPT-4 for the individual essays from I-JAS (see footnote 7). As shown, the QWK of all measures ranged from k = 0.819 for lexical density (number of lexical words (tokens)/number of words per essay) down to k = 0.644 for word2vec cosine similarity. Table 9 further presents the Pearson correlations between the 16 writing proficiency measures scored by human raters and GPT-4 for the individual essays. The correlations ranged from 0.672 for syntactic complexity to 0.734 for grammatical accuracy. The correlations between the writing proficiency scores assigned by human raters and the BERT-based AES system ranged from 0.661 for syntactic complexity to 0.713 for grammatical accuracy, and those between human raters and the OCLL-based AES system ranged from 0.654 for cohesion to 0.721 for grammatical accuracy. These findings indicated an alignment between the assessments made by human raters and both the BERT-based and OCLL-based AES systems in terms of various aspects of writing proficiency.

Reliability of LLM-driven AES discriminating proficiency levels

After validating the reliability of the LLM’s annotation and scoring, the subsequent objective was to evaluate its ability to distinguish between various proficiency levels. For this analysis, a dataset of 686 individual essays was utilized. Table 10 presents a sample of the results, summarizing the means, standard deviations, and the outcomes of the one-way ANOVAs based on the measures assessed by the GPT-4 model. A post hoc multiple comparison test, specifically the Bonferroni test, was conducted to identify any potential differences between pairs of levels.

As the results reveal, seven measures presented linear upward or downward progression across the three proficiency levels. These are marked in bold in Table 10 and comprise one measure of lexical richness, i.e. MATTR (lexical diversity); four measures of syntactic complexity, i.e. MDD (mean dependency distance), MLC (mean length of clause), CNT (complex nominals per T-unit), and CPC (coordinate phrase rate); one cohesion measure, i.e. word2vec cosine similarity; and GER (grammatical error rate). Regarding the ability of the sixteen measures to distinguish adjacent proficiency levels, the Bonferroni tests indicated that statistically significant differences exist between the primary level and the intermediate level for MLC and GER. One measure of lexical richness, namely LD, along with four measures of syntactic complexity (VPT, CT, DCT, ACC), two measures of cohesion (SOPT, SOPK), and one measure of content elaboration (IMM), exhibited statistically significant differences between proficiency levels. However, these differences did not demonstrate a linear progression between adjacent proficiency levels. No significant difference was observed in lexical sophistication between proficiency levels.

To summarize, our study aimed to evaluate the reliability and differentiation capabilities of the LLM-driven AES method. For the first objective, we assessed the LLM’s ability to differentiate between test takers with varying levels of writing proficiency using precision, recall, F-score, and quadratically weighted kappa. Regarding the second objective, we compared the scoring outcomes generated by human raters and the LLM to determine the level of agreement, employing quadratically weighted kappa and Pearson correlations across the 16 writing proficiency measures for the individual essays. The results confirmed the feasibility of using the LLM for annotation and scoring in AES for nonnative Japanese. As a result, Research Question 1 has been addressed.

Comparison of BERT-, GPT-, OCLL-based AES, and linguistic-feature-based computation methods

This section compares the effectiveness of five AES methods for nonnative Japanese writing: three LLM-driven approaches (BERT, GPT, and OCLL) and two linguistic-feature-based approaches (Jess and JWriter). The comparison was made by measuring the agreement between the ratings obtained from each approach and human ratings; all ratings were derived from the dataset introduced in the Dataset section. Agreement was assessed using QWK and PRMSE, and the performance of each approach is summarized in Table 11.
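Quadratic weighted kappa penalizes disagreements by the squared distance between ratings, so a one-point disagreement costs far less than a four-point one. A self-contained sketch of the computation (the ratings below are invented for illustration, not taken from the study’s data):

```python
def quadratic_weighted_kappa(a, b, min_r=1, max_r=5):
    """QWK between two lists of integer ratings on [min_r, max_r]."""
    n = max_r - min_r + 1
    obs = [[0] * n for _ in range(n)]            # observed rating pairs
    for x, y in zip(a, b):
        obs[x - min_r][y - min_r] += 1
    hist_a = [sum(row) for row in obs]
    hist_b = [sum(col) for col in zip(*obs)]
    total = len(a)
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            w = (i - j) ** 2 / (n - 1) ** 2          # quadratic penalty
            expected = hist_a[i] * hist_b[j] / total  # chance agreement
            num += w * obs[i][j]
            den += w * expected
    return 1.0 - num / den

# Hypothetical ratings for ten essays on a 1-5 scale.
human = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
model = [1, 2, 3, 3, 3, 4, 4, 4, 5, 5]
print(quadratic_weighted_kappa(human, model))
```

Perfect agreement yields 1.0, chance-level agreement yields 0, and the quadratic weights are what make QWK the conventional choice for ordinal proficiency scales.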

The QWK coefficients indicate that the LLMs (GPT, BERT, OCLL) agreed with human ratings more closely than the feature-based AES methods (Jess and JWriter) across the writing proficiency criteria, including lexical richness, syntactic complexity, content, and grammatical accuracy. Among the LLMs, the GPT-4-driven AES showed the highest agreement with human ratings on all criteria except syntactic complexity. The PRMSE values likewise suggest that the GPT-based method outperformed both the linguistic-feature-based methods and the other LLM-based approaches.

Moreover, an interesting finding emerged during the study: the agreement between GPT-4 and human scoring was even higher than the agreement among the human raters themselves. This finding highlights an advantage of GPT-based AES over human rating. Rating involves a chain of processes, including reading the learner’s writing, evaluating its content and language, and assigning a score, and at each step biases can be introduced by factors such as rater idiosyncrasies, test design, and rating scales. These biases can undermine the consistency and objectivity of human ratings. GPT-based AES benefits from applying consistent and objective evaluation criteria: when prompted with detailed scoring rubrics and linguistic features, the model follows a predefined set of guidelines and does not exhibit the subjective biases that human raters may. This standardization contributes to the higher agreement observed between GPT-4 and human scoring. The section Prompt strategy examines further how the choice and implementation of prompts affect the performance and reliability of LLM-based AES.
Furthermore, it is important to acknowledge the strengths of the local Japanese model, OCLL, which excels at processing certain idiomatic expressions. Nevertheless, our analysis indicated that GPT-4 surpasses the local models in AES. This superior performance can be attributed to the larger parameter size of GPT-4, estimated to be between 500 billion and 1 trillion, which exceeds the sizes of both BERT and OCLL.

Prompt strategy

Regarding prompt strategy, Mizumoto and Eguchi (2023) applied the GPT-3 model to automatically score English essays from the TOEFL test. They found that the accuracy of the GPT model alone was moderate to fair, but that incorporating linguistic measures such as cohesion, syntactic complexity, and lexical features alongside the model significantly improved accuracy. This highlights the importance of prompt engineering: providing the model with specific instructions enhances its performance. This study took a similar approach. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. Model 1 served as the baseline, representing GPT-4 without any additional prompting. Model 2 involved GPT-4 prompted with all 16 measures, including scoring criteria, linguistic features effective for writing assessment, and detailed measurement units and calculation formulas. The remaining models (Models 3 to 18) used GPT-4 prompted with one measure at a time. The performance of these 18 models was assessed using the output indicators described in the section Criteria (output indicator). By comparing their performance, the study aimed to understand the impact of prompt engineering on the accuracy and effectiveness of GPT-4 in AES tasks.
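To illustrate the structural difference between a baseline prompt and a measure-augmented prompt, the sketch below assembles a chat-style scoring request. Everything in it is hypothetical: the rubric wording, the abbreviated measure list, the score scale, and the `build_messages` helper are invented for illustration and are not the authors’ actual prompt.

```python
# Hypothetical, abbreviated measure list; the study prompts with all
# 16 measures, including measurement units and calculation formulas.
MEASURES = [
    "MATTR (lexical diversity, moving-window type-token ratio)",
    "MLC (mean length of clause)",
    "MDD (mean dependency distance)",
    "GER (grammatical error rate)",
]

def build_messages(essay: str) -> list:
    """Assemble a chat-style prompt: rubric plus measures as the
    system turn, the learner essay as the user turn."""
    rubric = (
        "Score the following Japanese learner essay on an integer scale.\n"
        "Base your judgement on these linguistic measures:\n"
        + "\n".join(f"- {m}" for m in MEASURES)
        + "\nReturn only the integer score."
    )
    return [
        {"role": "system", "content": rubric},
        {"role": "user", "content": essay},
    ]

messages = build_messages("私は日本語を勉強しています。")
print(messages[0]["content"])
```

Dropping the `MEASURES` section from the system turn recovers the shape of the unprompted baseline (Model 1); prompting with a single measure at a time corresponds to Models 3 to 18.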

Based on the PRMSE scores presented in Fig. 4, Model 1, GPT-4 without any additional prompting, achieved a fair level of performance, whereas Model 2, GPT-4 prompted with all measures, outperformed every other model with a PRMSE of 0.681. These results indicate that the inclusion of specific measures and prompts significantly enhanced the performance of GPT-4 in AES. Among the measures, syntactic complexity played a particularly significant role in improving the accuracy of GPT-4 in assessing writing quality, with lexical diversity emerging as another important contributor. The study suggests that a well-prompted GPT-4 can serve as a valuable tool to support human assessors in evaluating writing quality. Using GPT-4 as an automated scoring tool can minimize the evaluation biases associated with human raters, freeing teachers to focus on designing writing tasks and guiding writing strategies while leveraging GPT-4 for efficient and reliable scoring.
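PRMSE (proportional reduction in mean squared error) evaluates a system against estimated true scores rather than raw human labels, correcting both the human-score variance and the system’s MSE for rater measurement error. The sketch below captures the core idea under the simplifying assumption of exactly two human ratings per essay; it is a rough illustration, not the full estimator of Loukina et al. (2020).

```python
from statistics import mean, pvariance

def prmse(system, h1, h2):
    """Simplified true-score PRMSE with two human raters per essay.
    Rater-error variance is estimated from within-essay disagreement
    and subtracted from both the human-mean variance and the
    system's MSE against the human mean."""
    hbar = [(a + b) / 2 for a, b in zip(h1, h2)]
    # Single-rater error variance from paired disagreements.
    var_err = mean((a - b) ** 2 / 2 for a, b in zip(h1, h2))
    # The mean of two ratings carries half that error variance.
    var_true = pvariance(hbar) - var_err / 2
    mse_true = mean((s - h) ** 2 for s, h in zip(system, hbar)) - var_err / 2
    return 1 - mse_true / var_true

# Hypothetical scores: the system tracks the human mean closely.
print(prmse([1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 4, 4]))
```

A value of 1 means the system recovers the estimated true scores exactly; values near 0 (or negative) mean it explains little of the true-score variance, which is why PRMSE complements agreement statistics such as QWK.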

Figure 4. PRMSE scores of the 18 AES models.

This study aimed to investigate two main research questions: the feasibility of utilizing LLMs for AES and the impact of prompt engineering on the application of LLMs in AES.

To address the first objective, the study compared the effectiveness of five models: GPT, BERT, the Japanese local LLM (OCLL), and two conventional machine-learning-based AES tools (Jess and JWriter). The PRMSE values indicated that the GPT-4-based method outperformed the other LLMs (BERT, OCLL) and the linguistic-feature-based computational methods (Jess and JWriter) across the writing proficiency criteria. Furthermore, the agreement coefficient between GPT-4 and human scoring surpassed the agreement among the human raters themselves, highlighting the potential of GPT-4 to enhance AES by reducing bias and subjectivity, saving time, labor, and cost, and providing valuable feedback for self-study. Regarding the second goal, the role of prompt design was investigated by comparing 18 models built on GPT-4, the best-performing LLM: a baseline model, a model prompted with all 16 measures, and 16 models prompted with one measure at a time. The PRMSE scores showed that GPT-4 prompted with all measures achieved the best performance, surpassing the baseline and all other models.

In conclusion, this study has demonstrated the potential of LLMs in supporting human rating in assessments. By incorporating automation, we can save time and resources while reducing biases and subjectivity inherent in human rating processes. Automated language assessments offer the advantage of accessibility, providing equal opportunities and economic feasibility for individuals who lack access to traditional assessment centers or necessary resources. LLM-based language assessments provide valuable feedback and support to learners, aiding in the enhancement of their language proficiency and the achievement of their goals. This personalized feedback can cater to individual learner needs, facilitating a more tailored and effective language-learning experience.

Two important areas merit further exploration. First, prompt engineering requires attention to ensure optimal performance of LLM-based AES across different language types. This study revealed that GPT-4 prompted with all measures outperformed models prompted with fewer measures, so investigating and refining prompt strategies can enhance the effectiveness of LLMs in automated language assessment. Second, it is crucial to explore the application of LLMs to second-language assessment and learning for oral proficiency, as well as their potential in under-resourced languages. Recent advances in self-supervised machine learning have significantly improved automatic speech recognition (ASR) systems, opening up new possibilities for building reliable ASR, particularly for under-resourced languages with limited data. Challenges nonetheless persist in ASR. To begin with, ASR assumes correct word pronunciation for automatic pronunciation evaluation, which is problematic for learners in the early stages of acquisition whose accents are shaped by their native languages; accurately segmenting short words becomes difficult in such cases. In addition, developing precise audio-text transcriptions for non-native accented speech is a formidable task. Finally, assessing oral proficiency involves capturing linguistic features, including fluency, pronunciation, accuracy, and complexity, that current NLP technology does not easily capture.

Data availability

The dataset utilized was obtained from the International Corpus of Japanese as a Second Language (I-JAS), available at https://www2.ninjal.ac.jp/jll/lsaj/ihome2.html

J-CAT and TTBJ are two computerized adaptive tests used to assess Japanese language proficiency.

SPOT is a specific component of the TTBJ test.

J-CAT: https://www.j-cat2.org/html/ja/pages/interpret.html

SPOT: https://ttbj.cegloc.tsukuba.ac.jp/p1.html#SPOT .

The study utilized a prompt-based GPT-4 model, developed by OpenAI, which has an impressive architecture with 1.8 trillion parameters across 120 layers. GPT-4 was trained on a vast dataset of 13 trillion tokens, using two stages: initial training on internet text datasets to predict the next token, and subsequent fine-tuning through reinforcement learning from human feedback.

https://www2.ninjal.ac.jp/jll/lsaj/ihome2-en.html .

http://jhlee.sakura.ne.jp/JEV/ by Japanese Learning Dictionary Support Group 2015.

We express our sincere gratitude to the reviewer for bringing this matter to our attention.

On February 7, 2023, Microsoft began rolling out a major overhaul to Bing that included a new chatbot feature based on OpenAI’s GPT-4 (Bing.com).

Appendices E and F present the QWK coefficients between the scores computed by the human raters and those of the BERT and OCLL models.

Attali Y, Burstein J (2006) Automated essay scoring with e-rater® V.2. J. Technol. Learn. Assess. 4

Barkaoui K, Hadidi A (2020) Assessing Change in English Second Language Writing Performance (1st ed.). Routledge, New York. https://doi.org/10.4324/9781003092346

Bentz C, Tatyana R, Koplenig A, Tanja S (2016) A comparison between morphological complexity measures: Typological data vs. language corpora. In Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC), 142–153. Osaka, Japan: The COLING 2016 Organizing Committee

Bond TG, Yan Z, Heene M (2021) Applying the Rasch model: Fundamental measurement in the human sciences (4th ed). Routledge

Brants T (2000) Inter-annotator agreement for a German newspaper corpus. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), Athens, Greece, 31 May-2 June, European Language Resources Association

Brown TB, Mann B, Ryder N, et al. (2020) Language models are few-shot learners. Advances in Neural Information Processing Systems, Online, 6–12 December, Curran Associates, Inc., Red Hook, NY

Burstein J (2003) The E-rater scoring engine: Automated essay scoring with natural language processing. In Shermis MD and Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Čech R, Miroslav K (2018) Morphological richness of text. In Masako F, Václav C (ed) Taming the corpus: From inflection and lexis to interpretation, 63–77. Cham, Switzerland: Springer Nature

Çöltekin Ç, Taraka R (2018) Exploiting Universal Dependencies treebanks for measuring morphosyntactic complexity. In Aleksandrs B, Christian B (ed), Proceedings of the First Workshop on Measuring Language Complexity, 1–7. Torun, Poland

Crossley SA, Cobb T, McNamara DS (2013) Comparing count-based and band-based indices of word frequency: Implications for active vocabulary research and pedagogical applications. System 41:965–981. https://doi.org/10.1016/j.system.2013.08.002


Crossley SA, McNamara DS (2016) Say more and be more coherent: How text elaboration and cohesion can increase writing quality. J. Writ. Res. 7:351–370

CyberAgent Inc (2023) Open-Calm series of Japanese language models. Retrieved from: https://www.cyberagent.co.jp/news/detail/id=28817

Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, Minnesota, 2–7 June, pp. 4171–4186. Association for Computational Linguistics

Diez-Ortega M, Kyle K (2023) Measuring the development of lexical richness of L2 Spanish: a longitudinal learner corpus study. Studies in Second Language Acquisition 1-31

Eckes T (2009) On common ground? How raters perceive scoring criteria in oral proficiency testing. In Brown A, Hill K (ed) Language testing and evaluation 13: Tasks and criteria in performance assessment (pp. 43–73). Peter Lang Publishing

Elliot S (2003) IntelliMetric: from here to validity. In: Shermis MD, Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ


Engber CA (1995) The relationship of lexical proficiency to the quality of ESL compositions. J. Second Lang. Writ. 4:139–155

Garner J, Crossley SA, Kyle K (2019) N-gram measures and L2 writing proficiency. System 80:176–187. https://doi.org/10.1016/j.system.2018.12.001

Haberman SJ (2008) When can subscores have value? J. Educat. Behav. Stat., 33:204–229

Haberman SJ, Yao L, Sinharay S (2015) Prediction of true test scores from observed item scores and ancillary data. Brit. J. Math. Stat. Psychol. 68:363–385

Halliday MAK (1985) Spoken and Written Language. Deakin University Press, Melbourne, Australia

Hirao R, Arai M, Shimanaka H et al. (2020) Automated essay scoring system for nonnative Japanese learners. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pp. 1250–1257. European Language Resources Association

Hunt KW (1966) Recent Measures in Syntactic Development. Elementary English, 43(7), 732–739. http://www.jstor.org/stable/41386067

Ishioka T (2001) About e-rater, a computer-based automatic scoring system for essays [Konpyūta ni yoru essei no jidō saiten shisutemu e − rater ni tsuite]. University Entrance Examination. Forum [Daigaku nyūshi fōramu] 24:71–76

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput. 9(8):1735–1780


Ishioka T, Kameda M (2006) Automated Japanese essay scoring system based on articles written by experts. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–18 July 2006, pp. 233-240. Association for Computational Linguistics, USA

Japan Foundation (2021) Retrieved from: https://www.jpf.gp.jp/j/project/japanese/survey/result/dl/survey2021/all.pdf

Jarvis S (2013a) Defining and measuring lexical diversity. In Jarvis S, Daller M (ed) Vocabulary knowledge: Human ratings and automated measures (Vol. 47, pp. 13–44). John Benjamins. https://doi.org/10.1075/sibil.47.03ch1

Jarvis S (2013b) Capturing the diversity in lexical diversity. Lang. Learn. 63:87–106. https://doi.org/10.1111/j.1467-9922.2012.00739.x

Jiang J, Quyang J, Liu H (2019) Interlanguage: A perspective of quantitative linguistic typology. Lang. Sci. 74:85–97

Kim M, Crossley SA, Kyle K (2018) Lexical sophistication as a multidimensional phenomenon: Relations to second language lexical proficiency, development, and writing quality. Mod. Lang. J. 102(1):120–141. https://doi.org/10.1111/modl.12447

Kojima T, Gu S, Reid M et al. (2022) Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, New Orleans, LA, 29 November-1 December, Curran Associates, Inc., Red Hook, NY

Kyle K, Crossley SA (2015) Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Q 49:757–786

Kyle K, Crossley SA, Berger CM (2018) The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behav. Res. Methods 50:1030–1046. https://doi.org/10.3758/s13428-017-0924-4


Kyle K, Crossley SA, Jarvis S (2021) Assessing the validity of lexical diversity using direct judgements. Lang. Assess. Q. 18:154–170. https://doi.org/10.1080/15434303.2020.1844205

Landauer TK, Laham D, Foltz PW (2003) Automated essay scoring and annotation of essays with the Intelligent Essay Assessor. In Shermis MD, Burstein JC (ed), Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174

Laufer B, Nation P (1995) Vocabulary size and use: Lexical richness in L2 written production. Appl. Linguist. 16:307–322. https://doi.org/10.1093/applin/16.3.307

Lee J, Hasebe Y (2017) jWriter Learner Text Evaluator, URL: https://jreadability.net/jwriter/

Lee J, Kobayashi N, Sakai T, Sakota K (2015) A Comparison of SPOT and J-CAT Based on Test Analysis [Tesuto bunseki ni motozuku ‘SPOT’ to ‘J-CAT’ no hikaku]. Research on the Acquisition of Second Language Japanese [Dainigengo to shite no nihongo no shūtoku kenkyū] (18) 53–69

Li W, Yan J (2021) Probability distribution of dependency distance based on a treebank of Japanese EFL learners’ interlanguage. J. Quant. Linguist. 28(2):172–186. https://doi.org/10.1080/09296174.2020.1754611


Linacre JM (2002) Optimizing rating scale category effectiveness. J. Appl. Meas. 3(1):85–106


Linacre JM (1994) Constructing measurement with a Many-Facet Rasch Model. In Wilson M (ed) Objective measurement: Theory into practice, Volume 2 (pp. 129–144). Norwood, NJ: Ablex

Liu H (2008) Dependency distance as a metric of language comprehension difficulty. J. Cognitive Sci. 9:159–191

Liu H, Xu C, Liang J (2017) Dependency distance: A new perspective on syntactic patterns in natural languages. Phys. Life Rev. 21. https://doi.org/10.1016/j.plrev.2017.03.002

Loukina A, Madnani N, Cahill A, et al. (2020) Using PRMSE to evaluate automated scoring systems in the presence of label noise. Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Seattle, WA, USA → Online, 10 July, pp. 18–29. Association for Computational Linguistics

Lu X (2010) Automatic analysis of syntactic complexity in second language writing. Int. J. Corpus Linguist. 15:474–496

Lu X (2012) The relationship of lexical richness to the quality of ESL learners’ oral narratives. Mod. Lang. J. 96:190–208

Lu X (2017) Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Lang. Test. 34:493–511

Lu X, Hu R (2022) Sense-aware lexical sophistication indices and their relationship to second language writing quality. Behav. Res. Method. 54:1444–1460. https://doi.org/10.3758/s13428-021-01675-6

Ministry of Health, Labor, and Welfare of Japan (2022) Retrieved from: https://www.mhlw.go.jp/stf/newpage_30367.html

Mizumoto A, Eguchi M (2023) Exploring the potential of using an AI language model for automated essay scoring. Res. Methods Appl. Linguist. 3:100050

Okgetheng B, Takeuchi K (2024) Estimating Japanese Essay Grading Scores with Large Language Models. Proceedings of 30th Annual Conference of the Language Processing Society in Japan, March 2024

Ortega L (2015) Second language learning explained? SLA across 10 contemporary theories. In VanPatten B, Williams J (ed) Theories in Second Language Acquisition: An Introduction

Rae JW, Borgeaud S, Cai T, et al. (2021) Scaling Language Models: Methods, Analysis & Insights from Training Gopher. ArXiv, abs/2112.11446

Read J (2000) Assessing vocabulary. Cambridge University Press. https://doi.org/10.1017/CBO9780511732942

Rudner LM, Liang T (2002) Automated essay scoring using Bayes’ theorem. J. Technol. Learn. Assess. 1(2)

Sakoda K, Hosoi Y (2020) Accuracy and complexity of Japanese Language usage by SLA learners in different learning environments based on the analysis of I-JAS, a learners’ corpus of Japanese as L2. Math. Linguist. 32(7):403–418. https://doi.org/10.24701/mathling.32.7_403

Suzuki N (1999) Summary of survey results regarding comprehensive essay questions. Final report of “Joint Research on Comprehensive Examinations for the Aim of Evaluating Applicability to Each Specialized Field of Universities” for 1996-2000 [shōronbun sōgō mondai ni kansuru chōsa kekka no gaiyō. Heisei 8 - Heisei 12-nendo daigaku no kaku senmon bun’ya e no tekisei no hyōka o mokuteki to suru sōgō shiken no arikata ni kansuru kyōdō kenkyū’ saishū hōkoku-sho]. University Entrance Examination Section Center Research and Development Department [Daigaku nyūshi sentā kenkyū kaihatsubu], 21–32

Taghipour K, Ng HT (2016) A neural approach to automated essay scoring. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, 1–5 November, pp. 1882–1891. Association for Computational Linguistics

Takeuchi K, Ohno M, Motojin K, Taguchi M, Inada Y, Iizuka M, Abo T, Ueda H (2021) Development of essay scoring methods based on reference texts with construction of research-available Japanese essay data. In IPSJ J 62(9):1586–1604

Ure J (1971) Lexical density: A computational technique and some findings. In Coultard M (ed) Talking about Text. English Language Research, University of Birmingham, Birmingham, England

Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need. In Advances in Neural Information Processing Systems, Long Beach, CA, 4–7 December, pp. 5998–6008, Curran Associates, Inc., Red Hook, NY

Watanabe H, Taira Y, Inoue Y (1988) Analysis of essay evaluation data [Shōronbun hyōka dēta no kaiseki]. Bulletin of the Faculty of Education, University of Tokyo [Tōkyōdaigaku kyōiku gakubu kiyō], Vol. 28, 143–164

Yao S, Yu D, Zhao J, et al. (2023) Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36

Zenker F, Kyle K (2021) Investigating minimum text lengths for lexical diversity indices. Assess. Writ. 47:100505. https://doi.org/10.1016/j.asw.2020.100505

Zhang Y, Warstadt A, Li X, et al. (2021) When do you need billions of words of pretraining data? Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, pp. 1112-1125. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.90


This research was funded by National Foundation of Social Sciences (22BYY186) to Wenchao Li.

Author information

Authors and Affiliations

Department of Japanese Studies, Zhejiang University, Hangzhou, China

Department of Linguistics and Applied Linguistics, Zhejiang University, Hangzhou, China


Contributions

Wenchao Li is in charge of conceptualization, validation, formal analysis, investigation, data curation, visualization and writing the draft. Haitao Liu is in charge of supervision.

Corresponding author

Correspondence to Wenchao Li .

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Ethical approval was not required as the study did not involve human participants.

Informed consent

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental material file #1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Li, W., Liu, H. Applying large language models for automated essay scoring for non-native Japanese. Humanit Soc Sci Commun 11 , 723 (2024). https://doi.org/10.1057/s41599-024-03209-9


Received : 02 February 2024

Accepted : 16 May 2024

Published : 03 June 2024

DOI : https://doi.org/10.1057/s41599-024-03209-9




University of Notre Dame

Fresh Writing

A publication of the University Writing Program


Hesburgh Library Desks: The Documented History of Students

By Ellie McCarthy

Published: June 06, 2024

3rd place McPartlin Award

Two stacks of books in a library, on the left and the right, with lights overhead

Most students at the University of Notre Dame are familiar with the fourteen-story Hesburgh Library, a place where students and professors alike can be found completing their daily assignments or collaborating on projects. However, the first floor’s modern design fades as you make your way up each story, and history fills the space. Newly painted walls turn yellow and stained, padded spinning chairs give way to solid wooden posture correctors rather than lounge devices, and bookshelves cram the space, turning each floor into a labyrinth of knowledge.

Not only does the library age as each story gives way to the next; beyond the seemingly untouched first-floor lobby waits a plethora of student thoughts, feelings, hopes, frustrations, stresses, politics, and jokes carved or inked into the old wooden desks. Blundering your way through the maze of books, you will eventually find collections of wooden panels joined together in a manner that I can only describe as “study cells.” Three-inch-thick wooden shields separate each desk from its neighbors, which I assume had the original intention of providing privacy and focus to the student who chose to study there. However, these shields have adopted a new purpose and have been “improved upon” by generations of students and their chosen media: Sharpie, pen, pencil, marker, carvings, and more. At Notre Dame, the most vandalized spaces on campus, the Hesburgh Library desks, are the epitome of student connection.

While seated on a squeaky wooden chair paired with a similarly designed desk, we are subject to the influences of previous students: thoughts, quotes, and drawings forever etched into the wood leave their mark on generations of students.

Graffiti on a desk from a STEM major

Comments such as “finals suck” or “Dear ND, Thank you for gray hair at 20 years old” provide a pathway into the minds of previous students and allow for a sense of empathy among current and future desk users. Just as researchers and archaeologists discover the patterns of life among past civilizations through artifacts, present-day students can connect with previous generations and add their “2 cents” to the ever-evolving documentation of life at Notre Dame recorded in the wood of the library desks.

Students don’t only comment on their academic endeavors; they share thoughts that would otherwise be kept private. While working on my chemistry homework, I noticed a message that read “I’m in love with my best friend, but he will never know.” Reading such a personal remark made me recognize the power of leaving a message behind. The desks provide an anonymity that other forms of release cannot offer. With journals, your thoughts and statements can be tied back to you, but with graffiti, unless you sign your name, no one will know who is responsible for any comment. This form of journaling allows present students to look to the penciled-in words of past graduates as a means of validating their own feelings. In this way, the desks provide a means of connection not only among present students but across generations of Notre Dame alumni.

The desks aren’t all marred by school stresses and unrequited love. 

a drawing of Dwight Schrute on a desk with writing around him

Humor is ever-present in the desks’ vandalism and elicits a smile (at least internally) any time it’s noticed. On the 9th floor, a glorious line-art image of Dwight Schrute blesses the eye of any student who happens to walk past it. The unmistakable ode to such a widespread source of comedy undoubtedly sparks joy, or at least recognition, in those lucky enough to witness it. Dwight’s portrait is surrounded by “lol” and “you win,” signifying a connection between students that you don’t get anywhere else on campus.

There are no external pressures that prevent people from commenting on something they like or dislike. In the real world, someone might refrain from complimenting someone else’s outfit due to social anxiety or the possibility of awkwardness. In an anonymous environment such as the library desks, no one has to know who is tied to a comment or a drawing, leaving them unfiltered. Students can merely express themselves without feeling limited.

The desks provide a blank canvas for students, as walls, train cars, and highways do for graffiti artists. Leaving an addition in the wood allows students to establish their own sense of permanence and a concrete, lasting tie to the university. While the Notre Dame Alumni Association spans multiple generations and promises that every alum will forever be part of the Notre Dame community, many may also relish the comforting thought that since their words or drawings remain, they cannot be forgotten.

Unfinished lyrics to the classic “Mr. Brightside” make an appearance on the 12th floor, waiting to be completed.

lyrics of the song

However, “open up my eager eyes, ‘cause I’m…” is met with “Man, please stop crying.” This somewhat failed attempt at “finish the lyric” provides another example of the humor the desks carry. Because “Mr. Brightside” is one of the university’s traditional football-game songs, the student’s choice to write its lyrics creates a sense of community and belonging. Most, if not all, students at Notre Dame know that singing this classic tune during a football game is one of the student section’s favorite events. When reading the lyrics inked into the wooden desk, anyone aware of Notre Dame’s connection to “Mr. Brightside” may recall a game memory or experience a sense of familiarity.

graffiti from a well-wisher on a desk

Notre Dame students have also expressed optimism and well wishes to their peers, creating an environment of support and camaraderie. I’ve seen numerous quotes, some of my favorites being: “I wish you all the best in life: a few good pals, a couple of miracles, and a good pair ‘a socks” and “Live like you mean it. Love ‘till you feel it. - the goo goo dolls.” Whether these students were truly making an effort to positively impact someone or were merely suffering from boredom is uncertain. However, the result is the same. These quotes elicited a smile from me, as I’m sure they did from many others.

While Notre Dame prides itself on its founding history and tradition, the Hesburgh Library desks provide a history of Notre Dame students, the foundation of community on campus. Father Theodore Hesburgh, for whom the library is named, was committed to developing the institution into a place of intellect and faith (An extraordinary life), since the university had previously been known mainly for its athletic successes. Following Father Hesburgh’s plan for the library, its mission states: “Hesburgh Libraries cultivate curiosity and discovery as a hub for intellectual life. We advance the University’s research, teaching, and learning goals while fostering Notre Dame’s engagement with the global scholarly community.” However, I would argue that the statement is missing an important aspect: fostering student collaboration and community.

There is no doubt that the old study cells have provided a means of connection between students and their peers, even if that wasn’t their original intention. Quite the contrary, their construction seems to encourage separation between desk users. The desks have fulfilled the Hesburgh mission of cultivating academic excellence by granting students a quiet, secluded place to study. It’s important to note, however, that they’ve done so much more. From stress and frustration to hope and humor, the desks have allowed a timeless community to blossom.

For that reason, I propose an amendment to the current mission statement. Adding “fostering student collaboration and community” would further support Notre Dame’s commitment to its students, faculty, and alumni. It would also set a precedent for continued community-building within the university, especially in Hesburgh Library.

With each school year, new students filter in and out of campus, creating a dynamic community, but a Notre Dame community nonetheless. The university proudly acknowledges its alumni network for almost every major event, promotes the successes of current students, and extends a warm welcome to future Notre Dame families. By doing so, the university creates the notion that Notre Dame is a compilation of its past, present, and future.

Through somewhat unorthodox and unsanctioned means, students have found a way to foster community across generations and preserve the memory of past students. In doing so, we have supported Notre Dame’s commitment to its students and promoted the pillars of community and connection.

Some might view the old desks as disgustingly vandalized property, an unfortunate misuse of a study space. In a way, they would be right. Along with the documented stresses and warm wishes are the classic graffiti encouragements to “commit tax fraud” and “do coke.” The desks no longer appear pristine and inviting the way the new study lounges on the first and second floors do. However, what they lack in appearance, they make up for in substance. Beyond the surface-level defacing of school property lies a Notre Dame history that is separate from, but just as significant as, the university’s founding. I want to be clear that I am not condoning the vandalism of any space on the Notre Dame campus. I am merely expressing my appreciation and acknowledgment of the history now embedded in the Hesburgh Library desks.

What is the significance of the vandalized desks? Are they art? Are they merely defaced property? They are more than both. They are our history.

Works Cited

“An Extraordinary Life.” Father Hesburgh, University of Notre Dame, 2023, hesburgh.nd.edu/fr-teds-life/an-extraordinary-life/.

“Mission and Vision.” Hesburgh Libraries, University of Notre Dame, www.library.nd.edu/mission-vision#:~:text=Hesburgh%20Libraries%20cultivates%20curiosity%20and,with%20the%20global%20scholarly%20community. Accessed 12 Oct. 2023.

Ellie McCarthy

In her essay, “Hesburgh Library Desks: The Documented History of Students,” Ellie discusses the rhetorical significance of vandalized study spaces and offers the novel perspective that vandalism fosters student connection. As a Neuroscience and Behavior major, Ellie spent long hours studying in the library, which allowed her to explore the many layers of Hesburgh, where she discovered past students’ additions to the desks. Her personal reactions to the drawings, quotes, and well-wishes inspired her to analyze the connection between Notre Dame students of the past, present, and future. Ellie aims to pursue a career in physical therapy, where she hopes to study the potential impacts of mentality on the recovery process. She would like to thank her parents (ND class of 1997) for introducing her to Notre Dame and the opportunities it presents. Ellie would also like to thank Professor Jessica Thomas for inspiring her and contributing to her newfound passion for writing.

The Beauty of Embracing Aging

A black and white close-up of three weathered hands.

By Charles M. Blow

Opinion Columnist

As Evelyn Couch said to Ninny Threadgoode in Fannie Flagg’s “Fried Green Tomatoes at the Whistle Stop Cafe”: “I’m too young to be old and too old to be young. I just don’t fit anywhere.”

I think about this line often, this feeling of being out of place, particularly in a culture that obsessively glorifies youth and teaches us to view aging as an enemy.

No one really tells us how we’re supposed to age, how much fighting against it and how much acceptance of it is the right balance. No one tells us how we’re supposed to feel when the body grows softer and the hair grayer, how we’re supposed to consider the creping of the skin or the wrinkles on the face that make our smiles feel unfortunate.

The poet Dylan Thomas told us we should “rage, rage against the dying of the light,” that “old age should burn and rave at close of day.” He died, sadly, before turning 40.

For those of us well past that mark, rage feels futile, like a misallocation of energy. There is, after all, a beauty in aging. And aging is about more than how we look and feel in our bodies. It’s also about how the world around us plows ahead and pulls us along.

I remember a call, a few years ago, from a longtime friend who said it looked as if her father was about to pass away. I remember meeting her, along with another friend, at her father’s elder care facility so she wouldn’t have to be alone, and seeing the way her tears fell on his face as she stroked his cheeks and cooed his name; the way she collapsed in the hallway on our way out, screaming, not knowing if that night would be his last.

He survived, and has survived several near-death experiences since, but I saw my friend’s struggle with her father’s health difficulties as a precursor to what might one day be my struggle with my parents’ aging and health challenges. And it was.

Soon after that harrowing night at the elder care facility, my mother, who lives alone, suffered a stroke. Luckily, one of my brothers was having breakfast with her that morning and, noticing that her speech was becoming slurred, rushed her to the emergency room.

On the flight to Louisiana, I tried in vain to remain calm, not knowing what condition she would be in when I arrived, not knowing the damage the stroke had done. When I finally laid eyes on her, it was confirmed for me how fortunate we were that my brother had been alert and acted quickly. My mother would fully recover, but the image of her in that hospital bed — diminished from the commanding, invincible image of her that had been burned into my mind — shook me and has remained with me.

In that moment, I was reminded that my mother was in the final chapter of her life, and that I was moving into a new phase of mine.

That is one of the profound, emotional parts of aging: assuming a new familial role. Recognizing that my brothers and I were graduating from being the uncles to being the elders.

And that shifting family dynamic exerts itself on both ends, from above and below. This year, my older son turned 30. There’s no way to continue to consider yourself young when you have a child that age. He isn’t a father yet, but it has dawned on me that by the time I was his age, I had three children and my marriage was coming to an end. In fact, by the time I was his age, all of my mother’s grandchildren had been born.

No matter how young you may look or feel, time refuses to rest. It forges on. I’m now right around the age my parents were when I first considered them old.

I’m not sure when the world will consider me old — maybe it already does — but I do know that I’m no longer afraid of it. I welcome it. And I understand that the best parts of many books are their final chapters.

The actress Jenifer Lewis, appearing on the nationally syndicated radio show “The Breakfast Club,” once remarked: “I’m 61. I got about 30 more summers left.” Since hearing those words, I’ve thought of my own life in that way, in terms of how many summers I might have left. How many more times will I see the leaves sprout and the flowers bloom? How many more times will I spend a day by the pool or enjoy an ice cream on a hot day?

I don’t consider these questions because I’m worried, but because I want to remind myself to relish. Relish every summer day. Stretch them. Fill them with memories. Smile and laugh more. Gather with friends and visit family. Put my feet in the water. Grow things and grill things. I make my summers count by making them beautiful.

I have no intention of raging against my aging. I intend to embrace it, to embrace the muscle aches and the crow’s feet as the price of growing in wisdom and grace; to understand that age is not my body forsaking me but my life rewarding me.

Aging, as I see it, is a gift, and I will receive it with gratitude.

Charles M. Blow is an Opinion columnist for The New York Times, writing about national politics, public opinion and social justice, with a focus on racial equality and L.G.B.T.Q. rights.
