My termites win
- Aug 6, 2007
I am familiar with the transformer model and the "Attention Is All You Need" paper. That's a very general maxim about avoiding overfitting, though; you'd need to explain the actual reason why this model isn't overfitting. At base, whatever your network is, no matter what kind, you are fitting a multidimensional model, and if your weights/parameters are under-determined (or even perfectly determined) you will be responding to noise. One test would be a cross-validation check: randomly split the data into training and test sets in several different ways (restarting from your baseline each time) and measure the Bayesian Information Criterion, the Akaike Information Criterion, or some similar metric.

I think Transformers built on neural networks may operate differently than you are thinking. We used a T5-small model, which provides eight-headed attention across the encoder and decoder, resulting in approximately 60 million parameters.
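Those numbers are easy to confirm from the published checkpoint. A minimal sketch, assuming the Hugging Face `transformers` library and the public `t5-small` weights:

```python
# Minimal check of the head and parameter counts for the public "t5-small"
# checkpoint (assumes the Hugging Face `transformers` library is installed).
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")

print(model.config.num_heads)                      # 8 attention heads per layer
print(sum(p.numel() for p in model.parameters()))  # roughly 60 million parameters
```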
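Coming back to the random-split check with an information criterion suggested in the first comment, a generic sketch of what that could look like follows; `fit_from_baseline` and `log_likelihood` are hypothetical placeholders for whatever training and scoring routines the model actually uses:

```python
# Generic sketch of repeated random train/test splits scored with AIC/BIC.
# `fit_from_baseline` and `log_likelihood` are hypothetical stand-ins for the
# model's real training and scoring code.
import math
import random

def aic(k, log_lik):
    # Akaike Information Criterion: 2k - 2 ln(L)
    return 2 * k - 2 * log_lik

def bic(k, n, log_lik):
    # Bayesian Information Criterion: k ln(n) - 2 ln(L)
    return k * math.log(n) - 2 * log_lik

def random_split_check(examples, num_params, fit_from_baseline, log_likelihood,
                       splits=5, test_fraction=0.2):
    results = []
    for _ in range(splits):
        shuffled = examples[:]
        random.shuffle(shuffled)
        cut = int(len(shuffled) * (1 - test_fraction))
        train, test = shuffled[:cut], shuffled[cut:]
        model = fit_from_baseline(train)        # restart from the baseline each split
        train_ll = log_likelihood(model, train)
        results.append({
            "aic": aic(num_params, train_ll),
            "bic": bic(num_params, len(train), train_ll),
            "held_out_ll": log_likelihood(model, test),
        })
    return results
```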
If your model uses pretrained embeddings and transfer learning from a previously trained model, the need for the parameter count to be small compared to the number of data points is no longer an issue (because the pre-training was done on ridiculous amounts of data to start with).
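To make that concrete: fine-tuning in this setting only nudges weights that were already fit during pre-training, so the small task dataset never has to determine all ~60 million parameters on its own. A rough sketch of such a fine-tuning loop, assuming the Hugging Face `transformers` library and placeholder data (not the actual training code used here):

```python
# Rough sketch of fine-tuning a pre-trained t5-small model on a small task
# dataset (placeholder examples; assumes `transformers`, `torch`, and `sentencepiece`).
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")  # weights already learned on the pre-training corpus

# Placeholder task examples; in practice this is the small labelled set.
pairs = [("translate English to German: Hello, world.", "Hallo, Welt.")]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for source, target in pairs:
    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss  # cross-entropy over the target tokens
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```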
Edit: Found out your model was indeed pre-trained.