How to interpret the SUS learnability score?

A quick search on Google leads me to MeasuringU:

The average SUS score is 68. When we look at the scores of 500 products, we find that the average SUS score is 68. It is important to remember that SUS scores are not percentages. Even though the SUS ranges from 0 to 100, these scaled scores are not a percentage. A score of 68 corresponds to 68% of the maximum score, but it falls right at the 50th percentile. It is best to express the raw number as a score and, when you want to express it as a percentage, convert the raw score to a percentile by comparing it to the database.

On this basis, the scores you have given seem rather low, whether the scores are based on percentiles or absolute values.


Edit: apparently I misread the OP; it only covers 2 questions, not the full SUS.
I expected that multiplying by 5 would correct for this, but that is not entirely true. As noted in Section 2 of the MeasuringU article, learnability scores tend to run about 10% above the overall average. This means that the average of all SUS-Learnability scores would be about 75 (68 * 1.1 = 74.8) when scaled to 10 questions, or about 15 (74.8 * 2 / 10 = 14.96) for the two questions.

On this basis, 13.45 on average and 6.88 are still (somewhat) weak, although 15.58 is just above average.
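
The arithmetic above can be written out as a small Python sketch. The constants are the figures quoted in this answer (an average of 68 and a roughly 10% uplift for learnability), not values re-checked against MeasuringU, and the function names are made up for illustration. It also assumes the two-question score sits on a 0 to 20 scale, which is what the "multiply by 5" step implies.

```python
# Figures quoted in this answer; treat them as assumptions, not verified data.
FULL_SUS_AVERAGE = 68.0      # average full SUS score on the 0-100 scale
LEARNABILITY_UPLIFT = 1.10   # learnability runs roughly 10% above that average
TOTAL_ITEMS = 10             # questions in the full SUS
LEARNABILITY_ITEMS = 2       # questions in the learnability subscale


def learnability_benchmark() -> float:
    """Approximate average learnability score on the two-question scale."""
    full_scale = FULL_SUS_AVERAGE * LEARNABILITY_UPLIFT
    return full_scale * LEARNABILITY_ITEMS / TOTAL_ITEMS


def compare(score: float) -> str:
    """Say whether a two-question score is above or below the benchmark."""
    return "above" if score > learnability_benchmark() else "below"


# The three scores mentioned in the question.
for score in (13.45, 6.88, 15.58):
    # score * 5 rescales the two-question score to the familiar 0-100 range.
    print(score, compare(score), round(score * 5, 2))
```

This only compares raw scores against a rough average. It does not produce a percentile, which would require the underlying benchmark database.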


As for the difference in the group scores, I suppose that one group had already used the software (or similar software) before? Using something for a while changes your view of it, and this can have a significant effect on SUS scores.

However…

If you use a system without knowing its fundamentals (e.g. how it is scored), the results are probably not very reliable. How do you know you asked the right questions? Do you have context, such as scores and comparable data from similar sites or applications? A music player can be MUCH easier to learn than a customer / billing database.

Reading between the lines, you seem to be looking for a simple system to show that your work is good. This simply does not exist: simplicity always comes at the expense of detail and / or quality. Metacritic has often been criticized for this. How can a single number compare Toy Story to Schindler's List?

At best, you will probably be limited to comparing the usability of several iterations of your design. At worst, you'll be too distracted by this single number. Instead of looking at the SUS score, which is extremely broad and generic, try to build a survey based more on the actual use cases.

How easy was it to find a song in your app compared to competitors like Spotify and iTunes? Is it easy to compare the experience of 2 employees in one area? Specific questions like these are much more useful because they also indicate a direction to follow, rather than simply a better / worse ranking.