Personal pronouns are one of the earliest areas of study when learning a new language. In
particular, ‘I’, together with ‘I am’ or ‘I’m’, is a word that a learner of English needs to
master quickly in order to be able to communicate. For a Swedish learner writing in English,
there are two difficult aspects to come to grips with here. Firstly, the pronoun is written
with a capital letter regardless of its position in a sentence, a concept which does not
exist in Swedish. Secondly, if the learner wants to use the contracted form common in spoken
English, an apostrophe is needed; the apostrophe is rarely used in Swedish, and when it is,
it does not mark contractions. The study “Error analysis: A study of Swedish junior
high school students’ texts and grammar knowledge” (reference), which analyses the most
frequent grammatical errors made by junior high school students in the ninth grade as well
as …… , in controlled and free writing productions, shows that capitalization and
contractions are the categories with the lowest frequency of errors. In addition, it is easy
to assume that learners of English in high school and even junior high school, who began
studying English at the age of ten at the latest and often earlier, depending on the school,
would not make a mistake in writing this very common personal pronoun. However, preliminary
search results show that errors in this area are frequent enough to warrant a study. Using the
full dataset of the corpus ULEC, this essay will analyse the four possible variables of the
contracted form: ‘I’m’, ‘i’m’, ‘Im’ and ‘im’, and test the following hypothesis in a
quantitative analysis:
Age is a factor in whether Swedish learners of English err when using ‘I’m’ in written
productions.
The study includes search data from the sixth grade to the third year of high school, i.e. seven
age groups. No consideration will be taken as to whether several tokens come from the same
student. By manually counting all of the tokens generated by the search, comparing the results
age group by age group, and then normalizing the results, numbers will emerge that either
support or fail to support the hypothesis. The analysis will first go through the search
results one by one, starting with the data generated by the search strings. Thereafter, the
different variables and the age groups are analysed, followed by a discussion of the results
and reflections. An appendix contains the calculations behind the figures included in this
essay.
The initial search string ‘I’m’ generated 322 tokens, including the variable ‘i’m’.
However, the possibility of other spelling errors made it necessary to also use the search
string ‘Im’, which generated 127 tokens, including the variable ‘im’. This adds up to 449
tokens. Of these, 287 are the correct form ‘I’m’ and 162 are incorrect forms, divided into
three variations: 35 tokens of ‘i’m’, 46 tokens of ‘Im’ and 81 tokens of ‘im’.
Table 1.
Variables   Search tokens   %*
I’m         287             64%
i’m         35              8%
Im          46              10%
im          81              18%
Total       449
*The decimals have been rounded off to the closest whole number.
According to these numbers, nearly two thirds of the generated tokens are written without any
errors. The most frequent error is ‘im’: 18% of the tokens are written with neither a capital
letter nor an apostrophe. Interestingly, this double error is more frequent than either of the
two variables containing only one of the errors. It is also the variable that accords most
closely with the usage of capital letters and contractions in the Swedish language.
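The percentages in Table 1 can be reproduced from the raw token counts. The following is a minimal Python sketch; the dictionary and function names are illustrative and not part of the original study.

```python
# Token counts per variable, as reported in Table 1.
token_counts = {"I'm": 287, "i'm": 35, "Im": 46, "im": 81}


def percentage_shares(counts):
    """Return each variable's share of all tokens, rounded to a whole percent."""
    total = sum(counts.values())
    return {variable: round(100 * n / total) for variable, n in counts.items()}


shares = percentage_shares(token_counts)
for variable, share in shares.items():
    print(f"{variable}: {share}%")
```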
However, since the number of tokens included in each of the variables is different, the
numbers above do not represent the four variables in a truthful way. There is a need for
normalization: for each variable, the number of tokens per age group is first divided by the
total number of words in the database for that age group. This quotient is then multiplied by
10,000, and the resulting values for the age groups are finally added together in order to get
a proper representation of each of the four variables.
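As an illustration of the normalization described above, the sketch below applies it to the correct variable ‘I’m’. The token counts are those listed in the appendix; the word counts per age group are taken from the divisors in the appendix calculations, which is an assumption, since the essay does not list them separately. The function and variable names are hypothetical.

```python
# Word counts per age group: assumed from the divisors in the appendix.
WORDS_PER_GROUP = {
    "Gy_3": 40241, "Gy_2": 38733, "Gy_1": 123652, "Hs_9": 40495,
    "Hs_8": 11513, "Hs_7": 14719, "M_6": 7236,
}

# Tokens of the correct form "I'm" per age group, from the appendix table.
IM_CORRECT_TOKENS = {
    "Gy_3": 35, "Gy_2": 79, "Gy_1": 92, "Hs_9": 44,
    "Hs_8": 3, "Hs_7": 19, "M_6": 15,
}


def per_10k(tokens, words):
    """Normalized frequency: tokens per 10,000 words, rounded to two decimals."""
    return round(tokens / words * 10_000, 2)


def normalize(tokens_per_group, words_per_group):
    """Normalize every age group, then add the group values into one figure."""
    rates = {
        group: per_10k(count, words_per_group[group])
        for group, count in tokens_per_group.items()
    }
    return rates, round(sum(rates.values()), 2)


rates, total = normalize(IM_CORRECT_TOKENS, WORDS_PER_GROUP)
print(rates)
print(total)
```

Under these assumptions the summed value for ‘I’m’ comes out at 83.66, the figure reported for this variable.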
Figure 1. Normalized word count per 10,000 words for the four variables.
I’m: 83.66   i’m: 18.5   Im: 10.52   im: 18.06
According to these numbers, the most common spelling mistake is the variable ‘im’, which is
in accord with the numbers in Table 1. This may be an indication of the Swedish language
system making itself known, since Swedish has no capitalized pronouns and does not use
apostrophes to mark contracted forms. Additionally, the frequency of the correct variable,
‘I’m’, is by far the largest, which it ought to be at the ages included in this essay.
The objective of this essay is to analyse whether or not age is a factor in making errors
when contracting ‘I am’ in written productions. It therefore makes sense to first analyse
whether age is a factor in writing said phrase correctly, the assumption being that the oldest
students should have the highest frequency of correct answers.
Figure 2. Normalized word count per 10,000 words within each age group, ‘I’m’.
Gy_3: 8.7   Gy_2: 20.4   Gy_1: 7.44   Hs_9: 10.87   Hs_8: 2.61   Hs_7: 12.91   M_6: 20.73
As the figure above shows, the numbers do not support the assumption that the oldest students
have the highest frequency of correct answers. Interestingly, the age group with the highest
frequency of correct spelling is the youngest, grade six, very closely followed by the second
year of high school. The oldest students are not even in third or fourth place but fifth. The
first-year students of high school have the sixth highest frequency, which is the second
lowest overall, but still considerably higher than the eighth-graders, who have a very low
frequency compared to the other age groups.
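The ranking discussed above can be checked by sorting the Figure 2 values. A short Python sketch, with the values copied from the figure and illustrative names:

```python
# Normalized frequency of the correct form "I'm" per age group (Figure 2).
correct_rates = {
    "Gy_3": 8.7, "Gy_2": 20.4, "Gy_1": 7.44, "Hs_9": 10.87,
    "Hs_8": 2.61, "Hs_7": 12.91, "M_6": 20.73,
}

# Sort age groups from highest to lowest frequency of correct spelling.
ranking = sorted(correct_rates, key=correct_rates.get, reverse=True)

for place, group in enumerate(ranking, start=1):
    print(place, group, correct_rates[group])
```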
The analysis will now focus on the three incorrect variables and the frequency of words
within each of the age groups and will display them side-by-side.
Figure 3. i’m, Im, im: normalized total word count per 10,000 words per age group.
        Gy_3   Gy_2   Gy_1   Hs_9   Hs_8   Hs_7   M_6
i’m     7.46   1.29   0.65   2.72   0.87   1.36   4.15
Im      0      4.13   1.62   1.48   2.61   0.68   0
im      1.74   6.97   2.59   2.47   0.87   2.04   1.38
From the numbers above, it is clear that the oldest students have the highest frequency in
missing the capitalized pronoun, i.e. writing a lower-case ‘i’. It is noteworthy that neither
the oldest nor the youngest students are represented in the variable ‘Im’. Moreover, the
double-error variable, ‘im’, has a very high representation among the second-year students
in high school. To make the picture clearer, the next figure will juxtapose the correct
variable and the incorrect variables, added together into one group.
Figure 4. Comparison of correct and incorrect spelling per age group.
                     Gy_3   Gy_2    Gy_1   Hs_9    Hs_8   Hs_7    M_6
correct spelling     8.7    20.4    7.44   10.87   2.61   12.91   20.73
incorrect spelling   9.2    12.39   4.86   6.67    4.35   4.08    5.53
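The combined ‘incorrect spelling’ value in Figure 4 is the sum of the three incorrect variables for each age group. A minimal sketch of that addition, using the values reported in Figure 3 (names are illustrative; an age group with no tokens of a variable is entered as 0):

```python
# Normalized frequencies per age group as reported in Figure 3,
# in the order (i'm, Im, im). A variable with no tokens is entered as 0.
incorrect_rates = {
    "Gy_3": (7.46, 0.0, 1.74),
    "Gy_2": (1.29, 4.13, 6.97),
    "Gy_1": (0.65, 1.62, 2.59),
    "Hs_9": (2.72, 1.48, 2.47),
    "Hs_8": (0.87, 2.61, 0.87),
    "Hs_7": (1.36, 0.68, 2.04),
    "M_6": (4.15, 0.0, 1.38),
}

# Add the three incorrect variables into one value per age group.
combined = {
    group: round(sum(values), 2) for group, values in incorrect_rates.items()
}

for group, value in combined.items():
    print(group, value)
```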
List of references:
The corpus ULEC, full dataset
http://www.diva-portal.org/smash/get/diva2:496190/FULLTEXT01.pdf
Appendix
The results of the calculations have been rounded off to two decimals.
Search strings used:
Search results:
Total number of tokens:
          I’m   i’m   Im    im
Gy_3      35    3     0     7     = 45
Gy_2      79    5     16    27    = 127
Gy_1      92    8     20    32    = 152
Hs_9      44    11    6     10    = 61
Hs_8      3     1     3     1     = 8
Hs_7      19    2     1     3     = 25
M_6       15    3     0     1     = 19
Total:    287   35    46    81
Normalizing total number of tokens within each age group:
Gy_3 = 45/40241*10,000 = 11.18      Gy_2 = 127/38733*10,000 = 32.79
Gy_1 = 152/123652*10,000 = 12.29    Hs_9 = 61/40495*10,000 = 15.06
Hs_8 = 8/11513*10,000 = 6.95        Hs_7 = 25/14719*10,000 = 16.98
M_6 = 19/7236*10,000 = 26.26
Normalizing total number of tokens within each age group and variable:
I’m
Gy_3 = 35/40241*10,000 = 8.7        Gy_2 = 79/38733*10,000 = 20.4
Gy_1 = 92/123652*10,000 = 7.44      Hs_9 = 44/40495*10,000 = 10.87
Hs_8 = 3/11513*10,000 = 2.61        Hs_7 = 19/14719*10,000 = 12.91
M_6 = 15/7236*10,000 = 20.73
i’m
Gy_3 = 3/40241*10,000 = 7.46        Gy_2 = 5/38733*10,000 = 1.29
Gy_1 = 8/123652*10,000 = 0.65       Hs_9 = 11/40495*10,000 = 2.72
Hs_8 = 1/11513*10,000 = 0.87        Hs_7 = 2/14719*10,000 = 1.36
M_6 = 3/7236*10,000 = 4.15
Im
Gy_3 = 0
Gy_2 = 16/38733*10,000 = 4.13       Gy_1 = 20/123652*10,000 = 1.62
Hs_9 = 6/40495*10,000 = 1.48        Hs_8 = 3/11513*10,000 = 2.61
Hs_7 = 1/14719*10,000 = 0.68        M_6 = 0