Slovak Adaptation of the Big Five Inventory (BFI-2): Psychometric Properties and Initial Validation

The article describes the adaptation of the Big Five Inventory-2 (BFI-2) to the Slovak language and cultural context. The Slovak translation was developed in three samples, with items selected on the basis of item analyses and basic psychometric properties. The present study estimates the psychometric properties of the Slovak BFI-2 and its hierarchical structure using exploratory and confirmatory factor analysis in an independent sample of 542 participants recruited through an online research panel. It also provides data on convergent-discriminant validity in relation to alternative Big Five measures (NEO-FFI, TIPI) and to standard well-being measures. The results showed good internal consistency at the domain level and somewhat lower internal consistency at the facet level. Both exploratory and confirmatory factor analyses successfully recovered the conceptual structure of the Slovak BFI-2. The BFI-2 domains and facets showed adequate convergent-discriminant validity, based on the meaningful pattern of correlations with the other Big Five measures and well-being scales. These findings suggest that the Slovak version of the BFI-2 is a reliable and valid measure of the Big Five personality traits, and is appropriate for use in Slovak and cross-cultural research.


Introduction
In the past decades, the Big Five approach has become a widely accepted and well-validated model for the description and assessment of personality (Goldberg, 1990; John, Naumann, & Soto, 2008; McCrae & Costa, 2008). This approach identified five robust personality traits: neuroticism, extraversion, agreeableness, conscientiousness and openness to experience. The traits do not represent a particular theoretical perspective; rather, they were derived from analyses of the natural-language terms people use to describe themselves and others (John, Naumann, & Soto, 2008). They are generally found to be cross-culturally generalizable (McCrae, Terracciano et al., 2005), and show strong predictive validity for different areas of human behavior such as work (Brandstätter, 2011), romantic relationships (Malouff et al., 2010) and health (Vollrath, Knoch, & Cassano, 1999).
Many psychometric measures of the Big Five personality traits have been developed, differing in complexity and length, such as the NEO Inventories (Costa & McCrae, 2010) or the Ten Item Personality Inventory (Gosling, Rentfrow, & Swann, 2003). Some of these measures, such as the NEO PI-R and NEO-PI-3, use a domain-and-facet approach based on the assumption that personality traits are structured hierarchically (Goldberg, 1999; Soto et al., 2011). In this approach the Big Five domains are conceptualized as broad and general traits located at the top of the hierarchy. Each Big Five domain subsumes more-specific lower-level traits, referred to as facets. One of the most frequently used Big Five measures is the Big Five Inventory (BFI), which was originally developed as a brief, 44-item inventory that would allow efficient and flexible assessment of the Big Five when there is no need for more differentiated measurement of facet traits within each trait domain (John, Donahue, & Kentle, 1991; John & Srivastava, 1999). The BFI does not use pairs of single adjectives, which are answered less consistently; instead, it uses short phrases based on trait adjectives known to be prototypical markers of the Big Five (John, Naumann, & Soto, 2008). The BFI has become widely used and has been psychometrically analyzed in many languages, such as Spanish (Benet-Martínez & John, 1998), Dutch (Denissen et al., 2008), Czech (Hřebíčková et al., 2016), and others. Cross-cultural research in 56 nations (Schmitt et al., 2007) found that the five-dimensional structure of the BFI was highly replicable across all the major cultural regions of the world, and that the scales possessed high levels of internal reliability across all cultures. Although the BFI did not originally aim to measure traits at the facet level, Soto and John (2009) found that it could assess 10 facets that converge with facets assessed by the NEO PI-R.
Recently, Soto and John (2017a) introduced a new version of the Big Five Inventory, named the BFI-2. It is designed to integrate new advances in personality structure and psychological assessment into the BFI, while still retaining three key strengths of the original measure: conceptual focus, ease of understanding, and brevity of assessment time. The BFI-2 tries to ensure an appropriate balance between bandwidth and fidelity (John, Hampson, & Goldberg, 1991) by adopting a hierarchical approach with domain-level and facet-level scales. While domain scales are construed with greater breadth (i.e., high bandwidth), facet scales provide more-detailed personality description (i.e., high fidelity). The BFI-2 is also designed to minimize the influence of acquiescent response style (Rammstedt, Danner, & Bosnjak, 2017), which can threaten the validity of questionnaire-based data (e.g., Rammstedt, Kemper, & Borg, 2013; Soto et al., 2008), by balancing the number of true-keyed and false-keyed items. This allows researchers to easily control for acquiescence at the item level by centering each individual's set of item responses around their within-person mean (see Soto & John, 2017a; Soto et al., 2008). The BFI-2 also adopts new labels for two domains: Neuroticism is replaced by the label Negative Emotionality, which better represents the focus of this domain on negative emotional experiences and more clearly distinguishes it from psychiatric illness, and Openness is replaced by the label Open-Mindedness, because the original label could be misinterpreted as openness to social experiences.
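The within-person centering described above can be illustrated with a minimal Python sketch. It assumes a respondents-by-items matrix of raw ratings (i.e., before false-keyed items are reverse-scored); the function name and the small data set are hypothetical and are not taken from the BFI-2 scoring materials.

```python
import numpy as np

def center_within_person(responses):
    """Subtract each respondent's mean raw rating from all of their item ratings.

    `responses` is a (respondents x items) array of raw ratings. With a
    content-balanced item set, the within-person mean serves as a rough
    index of acquiescence (an assumption of this sketch).
    """
    responses = np.asarray(responses, dtype=float)
    person_means = responses.mean(axis=1, keepdims=True)
    return responses - person_means

# Hypothetical ratings on a 1-5 scale: 3 respondents, 4 items.
raw = np.array([
    [5, 4, 5, 4],  # same response pattern as the second row, shifted upward
    [3, 2, 3, 2],
    [1, 2, 1, 2],
])
centered = center_within_person(raw)
```

After centering, the first two hypothetical respondents have identical profiles, because they differ only in their overall tendency to agree.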
Psychometric evaluation of the English-language version of the BFI-2 showed that it has good reliability at both the domain and facet levels and a robust factor structure. The BFI-2 also predicts conceptually relevant behavioral and psychological criteria in a meaningful way, with greater predictive power than the original BFI (Soto & John, 2017a). Analysis of gender differences showed that, similar to previous research with other measures (e.g., Costa, Terracciano, & McCrae, 2001), women tended to describe themselves as somewhat more extraverted, agreeable, conscientious, and emotional than men did (Soto & John, 2017a). The BFI-2 has been translated into German and Dutch and psychometrically analyzed in both languages (Danner et al., 2019; Denissen et al., in press). Both studies confirmed that the structure found in the English version was replicated in the local adaptations. Moreover, both versions showed good reliability at the domain level, sufficient reliability at the facet level, and good validity as examined by correlations with other personality inventories and external criteria related to different life domains. To sum up, the main advantages of the BFI-2 over other Big Five measures are that it a) provides personality assessment at both the domain and facet levels with relative brevity of assessment time and b) balances the number of true-keyed and false-keyed items in order to minimize the influence of acquiescent response style.
With this background in mind, the present research has two key goals. The first is to develop a Slovak version of the BFI-2 and to examine its psychometric properties such as internal consistency and hierarchical structure. The second is to extend the knowledge of the construct validity of the BFI-2 by examining its associations with two additional Big Five measures and selected well-being criteria. Well-being measures have been chosen as a validity criterion because previous research has found robust and consistent relationships of the Big Five traits with different aspects of well-being (e.g., Hayes & Joseph, 2003; Gutierrez, 2005). As well-being is considered a complex construct with different aspects, we decided to include several variables related to positive or negative psychological functioning, namely satisfaction with life, happiness, self-esteem, meaning in life and perceived stress. This strategy can provide more complex insight into the validity of the BFI-2, and can extend our knowledge of it by examining selected well-being criteria that have not been previously investigated. Our validity hypotheses, based on previous research, were that extraversion, agreeableness, and conscientiousness would have a positive relationship with well-being, whereas negative emotionality would have a negative relationship.
These aims are important for two audiences. The first is Slovak researchers who use personality trait measures in their research and who can obtain information about this new Big Five measure with strong conceptual clarity and robust hierarchical structure. The second is cross-cultural personality researchers, especially those who are interested in cross-cultural data related to the Big Five traits and their relations with other variables across cultures. The study presents data from an Eastern European country that is frequently underrepresented in large cross-cultural studies using the Big Five approach (e.g., Costa, Terracciano, & McCrae, 2001; Rammstedt, Kemper, & Borg, 2013). The results could help to fill this gap and contribute to knowledge related to cross-cultural applicability of the BFI-2 and the Big Five model in general.

Development of the Slovak BFI-2
The Slovak BFI-2 was developed through a translation and back-translation process led by the first two authors of this paper and supervised by the original BFI-2 authors. After developing a preliminary pool of item translations, the final selections were made based on item analyses and basic psychometric properties in three independent scale-development samples. The final version of the Slovak BFI-2 was found to have satisfactory psychometric properties and factor structure in these samples. A full description of the translation procedure, samples and descriptive characteristics, and results of exploratory and confirmatory factor analysis for these pilot studies are presented in Supplementary online material A. Building on these preliminary results, the present study aims to examine the reliability, structural validity, and external validity of the Slovak BFI-2 in an independent, general adult sample.

Sample
The sample in the present study consisted of 542 participants (268 males, 49.5%; 274 females), who completed an online version of the Slovak BFI-2 and other measures of Big Five personality traits and well-being. The data collection was performed in October and November of 2017. Participants were recruited through an online research panel, and were compensated for their participation by small credits that could be exchanged for different products. Age of the participants ranged from 18 to 86 years, with a mean of 41.79 (SD = 14.57). Nine participants (1.7%) had primary level of education, 307 (56.6%) had secondary level of education, and 226 (41.7%) had a university degree. All participants were informed about the goals of the study and they provided informed consent prior to the data collection.

Measures
Big Five measures. All participants answered demographic questions and completed the Slovak BFI-2. For validation of the BFI-2, two other Big Five questionnaires available in the Slovak language were used. The 60-item NEO-Five Factor Inventory (NEO-FFI; Costa & McCrae, 2010; Slovak version Ruisel & Halama, 2007) is a shorter version of the 240-item NEO PI-R, intended for situations in which general information on the domain level of personality is sufficient. It assesses each Big Five domain using a 12-item scale, with items rated on a 5-point Likert-type scale. Alpha reliabilities in the present sample were .83 for Neuroticism, .80 for Extraversion, .67 for Openness, .76 for Agreeableness and .88 for Conscientiousness. The Ten Item Personality Inventory (TIPI) was constructed by Gosling, Rentfrow, and Swann (2003; Slovak translation Halama & Gurňáková, 2014) as a very short self-report measure through a selection of adjectives from previous Big Five measures. The inventory contains 10 unipolar items with two adjective markers for each item and with two items for each Big Five trait. The items are rated on a 7-point scale (from Disagree strongly to Agree strongly). Alphas in the present sample were generally low due to the small number of items: .27 for Extraversion, .41 for Agreeableness, .66 for Conscientiousness, .64 for Emotional Stability, and .28 for Openness.
Well-being scales. The Oxford Happiness Questionnaire (OHQ) was developed from its longer version (Oxford Happiness Inventory) as a brief but well-validated measure for assessing happiness in its broad sense (Hills & Argyle, 2002; Slovak translation Babinčák & Pipasová Karolová, 2014). It contains 8 items focusing on different aspects of happiness and well-being, with a 6-point Likert scale provided for response. Psychometric analysis (Hills & Argyle, 2002) showed that the OHQ has good reliability and validity when correlated with its longer version, and with personality scales usually associated with well-being. The scale's alpha reliability in the present sample was .74.
The Satisfaction With Life Scale (SWLS) was created by Diener et al. (1985) to assess satisfaction with the respondent's life as a whole. It is a short, 5-item scale on which respondents indicate the extent to which they agree with each item on a seven-point Likert scale, ranging from strongly agree to strongly disagree. The SWLS is very frequently used in many languages to assess the cognitive aspect of well-being, and it has good convergent validity as well as temporal stability (Pavot & Diener, 2009). It was translated into Slovak by Halama and Dědová (2007). Its alpha reliability in the present sample was .90.
The Meaning in Life Questionnaire (MLQ) was constructed as a measure of meaning consisting of two subscales (Steger et al., 2006). The Presence subscale assesses cognitive appraisals of whether life is meaningful, and the Search subscale assesses general tendencies to actively seek meaning and purpose in life. The questionnaire has 10 items (5 for each subscale) with a 7-point Likert-type response format. The authors (Steger et al., 2006) demonstrated its good discriminant validity and stable factor structure. The Slovak translation used in the study comes from the scale author's official webpage, which does not provide authorship information for the Slovak translation. The alphas in the current sample were .89 for Presence and .80 for Search.
The Rosenberg Self-Esteem Scale (RSES) is a 10-item scale that measures global self-esteem (Rosenberg, 1965; Slovak translation Ficková, 1999). It has been widely used in research on self-esteem in different contexts and countries (e.g., Schmitt & Allik, 2005). It uses a 4-point rating scale format (ranging from absolutely disagree to absolutely agree) with five positively worded items and five negatively worded items. Many studies have shown it to have good reliability and validity (e.g., Pullman & Allik, 2000; Halama, 2008). The scale showed internal consistency of .87 in our sample.
Finally, the Perceived Stress Scale (PSS; Cohen, Kamarck, & Mermelstein, 1983; Slovak translation Halama & Bakošová, 2009) is a measure of an individual's appraisal of his or her life as stressful. The scale is available in different lengths, and the version used in this study contained 10 items rated by the participant on a 5-point Likert-type scale. The questions focus on the global perception of stress experienced during the previous month. The authors reported adequate reliability for the PSS-10 and supported its validity through correlations with life event scores, depressive and physical symptomatology, and other external criteria (Cohen, Kamarck, & Mermelstein, 1983). For this measure, alpha was .86 in the current sample.

Results
The results of descriptive and reliability analysis (Table 1) showed that domain alpha reliabilities for the Slovak BFI-2 ranged from .79 to .83 (M = .82). For facets, alphas ranged from .43 to .73 with a mean of .63, which is somewhat lower than in the original English study (M = .77). A similar decrease in internal consistency was also observed for the German BFI-2 (Danner et al., 2019), and is fairly typical when adapting psychological measures across cultural contexts. However, lower internal consistency could also reflect the data quality of the sample used. To investigate this possibility, we compared the corrected item-total correlations for the BFI-2 domains and facets with those for the NEO-FFI domains. Mean corrected item-total correlations were similar (.47, .42 and .45 for the BFI-2 domains, BFI-2 facets and NEO-FFI domains, respectively), suggesting that the lower alphas of the BFI-2 facets reflect their brevity and the overall data quality of this sample, rather than a problem specific to the Slovak BFI-2.
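The two reliability statistics used above, Cronbach's alpha and the corrected item-total correlation, can be sketched as follows. This is a minimal illustration on simulated data, not the analysis code of the present study; the function names, the sample size, and the simulated item structure (one common trait plus independent noise) are assumptions. The comparison of a 4-item and a 12-item scale mirrors the length of BFI-2 facet and domain scales and shows how scale brevity alone lowers alpha when inter-item correlations are held equal.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) array of keyed item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

def corrected_item_total(items):
    """Correlation of each item with the sum of the remaining items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Simulated data (hypothetical): every item = common trait + independent noise,
# so all inter-item correlations are about .50 in the population.
rng = np.random.default_rng(0)
n = 500
trait = rng.normal(size=n)
facet_items = trait[:, None] + rng.normal(size=(n, 4))    # 4-item scale
domain_items = trait[:, None] + rng.normal(size=(n, 12))  # 12-item scale

alpha_4 = cronbach_alpha(facet_items)
alpha_12 = cronbach_alpha(domain_items)
item_total_4 = corrected_item_total(facet_items)
```

With equal inter-item correlations, the shorter scale yields the lower alpha while the items themselves are no worse, which is why item-total correlations are a useful complement to alpha when scales differ in length.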
An analysis of gender differences showed that females scored significantly higher than males in Agreeableness and its facets, Extraversion and its facets Sociability and Energy Level, Open-Mindedness and its facet Aesthetic Sensitivity, as well as the facets of Responsibility and Anxiety. These gender differences were small to medium in size, ranging from .01 in Depression to .54 in Compassion (M = .30). Column-vector correlations comparing the overall pattern of gender differences obtained here with those in the original validation study for the English-language BFI-2 (Soto & John, 2017a) were .42 for the English online sample and .52 for the English student sample. This indicates a moderately similar pattern across studies.
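A column-vector correlation of this kind is simply the Pearson correlation between two vectors of effect sizes, with one entry per scale. The sketch below illustrates the idea; it assumes that the effect sizes are standardized mean differences (Cohen's d with a pooled standard deviation), which the text does not state explicitly, and all numbers in it are hypothetical placeholders rather than values from Table 1.

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

def column_vector_correlation(effects_study_1, effects_study_2):
    """Pearson correlation between two profiles of effect sizes (one per scale)."""
    return float(np.corrcoef(effects_study_1, effects_study_2)[0, 1])

# Hypothetical gender-difference profiles for the same four scales in two studies.
d_study_1 = np.array([0.10, 0.50, 0.30, -0.10])
d_study_2 = np.array([0.20, 0.40, 0.10, 0.00])
profile_similarity = column_vector_correlation(d_study_1, d_study_2)
```

In an actual comparison, each vector would hold one effect size for every BFI-2 domain and facet scale, listed in the same order in both studies.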
Correlations with age revealed positive age trends for Agreeableness, Conscientiousness, and their facets, as well as a positive age trend for the Aesthetic Sensitivity facet and a negative trend for the Emotional Volatility facet. All of these age trends had effect sizes of .10 to .20, and were consistent with previous research on adult personality development (e.g., Soto et al., 2011).
The Big Five factor structure of the BFI-2 items was assessed using random intercept exploratory factor analysis (Aichholzer, 2014), which includes a method factor to model individual differences in acquiescent responding (cf. Soto & John, 2017b). This analysis was conducted using Mplus 7.4; because a Mardia test suggested violations of the multivariate normality assumption, robust maximum likelihood was chosen as the method of estimation. Fifty-five items (90%) had their primary loading on the intended domain, with loadings ranging between .21 and .68 (M = .48). In contrast, absolute secondary loadings ranged from .00 to .47 (M = .12). Similarly, a principal component analysis (PCA) of the 15 facets showed that all facets loaded primarily on their intended domain. Primary loadings ranged from .60 to .88 (M = .77), while absolute secondary loadings ranged between .01 and .46 (M = .16), which suggests a very clear domain-level factor structure. Tables with results of these analyses are presented in Supplementary online material B.
A series of confirmatory factor analyses (CFAs) was used to verify the hierarchical structure of the Slovak BFI-2, with three facets nested within each Big Five domain. This analysis was carried out in the R statistical software environment, using the lavaan package and robust maximum likelihood estimation. Four models were compared for each domain. In the single domain model, every item loaded on a single factor representing the Big Five domain. In the single domain plus acquiescence model, every item was additionally constrained to load 1 on an acquiescence method factor. In the three facets model, each item loaded on its corresponding facet factor. Lastly, the acquiescence method factor was added to the facet factors in the three facets plus acquiescence model. As expected, the three facets plus acquiescence model had the best fit for each Big Five domain, with a CFI value of at least .923, TLI of at least .898, and RMSEA of no more than .068 for each domain (see Table 2). These results confirm the facet-level structure of the Slovak BFI-2 and the need to account for acquiescence when modeling item responses.
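The three facets plus acquiescence model described above can be written out as a measurement equation. The following is a sketch under stated assumptions rather than the exact specification estimated here: the notation is illustrative, each BFI-2 domain is taken to comprise 12 items (three facets of four items each), items are assumed to be entered without reverse-scoring so that a unit loading on the method factor represents agreement regardless of item content, and the method factor is assumed to be uncorrelated with the facet factors.

```latex
% Response of person i to item j within one Big Five domain.
% f(j) maps item j to its facet; a_i is the acquiescence method factor.
\begin{equation}
  x_{ij} = \nu_j + \lambda_j \, \eta_{i,f(j)} + 1 \cdot a_i + \varepsilon_{ij},
  \qquad j = 1, \dots, 12, \quad f(j) \in \{1, 2, 3\}
\end{equation}
% Assumed constraints: facet factors may correlate with one another,
% while the method factor is orthogonal to them.
\begin{equation}
  \operatorname{Cov}(\eta_{i,k}, a_i) = 0 \quad \text{for } k = 1, 2, 3
\end{equation}
```

Here the intercept is denoted by nu, the freely estimated facet loading by lambda, and the residual by epsilon. Dropping the method factor term gives the three facets model, and replacing the three facet factors with one common factor gives the single domain variants.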
Correlational analysis of the BFI-2 domains and facets (see Table 3) showed that absolute correlations between BFI-2 domains ranged from .28 to .49 (M = .40). These correlations are higher than in the original English version (Soto & John, 2017a), and may reflect the fact that discriminant correlations tend to be higher in paid research panels than in student and self-selected volunteer samples. This interpretation was supported by similarly inflated intercorrelations for the NEO-FFI in the present sample (range = .11 to .47, M = .30), as compared with those previously obtained in Slovak NEO-FFI standardization samples (range = .06 to .27, M = .14) based on students and self-selected volunteers (Ruisel & Halama, 2007). At the facet level, the Slovak BFI-2's mean within-domain facet correlation ranged between .42 and .67 (M = .54), while absolute between-domain facet correlations were lower, ranging from .03 to .55 (M = .27).
Convergent validity was assessed through correlations of the BFI-2 with the NEO-FFI and TIPI (Table 4). Same-trait different-method correlations show good convergence between BFI-2 and NEO-FFI, ranging from .63 to .77 (M = .72). As expected, correlations between BFI-2 facets and convergent NEO-FFI domains were somewhat lower on average (M = .60, ranging between .36 and .69), reflecting the distinctions between same-domain facets. Mean convergent correlations with the TIPI were .63 (ranging from .49 to .76) for the BFI-2 domain scales and .52 (ranging from .32 to .68) for same-domain facet scales. As expected, discriminant correlations between different domains were lower, averaging .32 (between .09 and .51) in size with the NEO-FFI and .28 (between .07 and .53) with the TIPI. Discriminant correlations of the facet scales averaged .27 (between .04 and .60) with the NEO-FFI and .23 (between .01 and .52) with the TIPI. The strongest of these correlations are conceptually meaningful, such as the negative correlations between Extraversion and Neuroticism/Negative Emotionality.
Table 5 presents external validity correlations and predictive power of the BFI-2 for well-being measures. Generally, all domains except Negative Emotionality showed positive correlations with positive indicators of psychological well-being and negative correlations with the Perceived Stress Scale. On average, the strongest absolute correlations of these well-being measures were found with the Negative Emotionality (M = .46) and Extraversion (M = .43) domains, and with the Depression (M = .50) and Energy Level (M = .42) facets. We also compared the predictive power of the BFI-2 domains vs. facets for well-being measures using multiple regression analysis and R² values as criteria. These analyses showed that the BFI-2 domains had somewhat lower predictive power than the facets, with mean determination coefficients of .35 for domains vs. .38 for facets. These results suggest a rather modest, 10% relative increase in predictive power for the BFI-2 facets over the domains.
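The comparison of domain-level and facet-level predictive power rests on R² values from two nested regressions, and the relative increase is the difference between them divided by the domain-level R². A minimal sketch follows; the simulated data, variable names, and helper functions are hypothetical, and ordinary least squares stands in for whatever regression routine was actually used.

```python
import numpy as np

def r_squared(predictors, outcome):
    """Coefficient of determination from an OLS regression with an intercept."""
    outcome = np.asarray(outcome, dtype=float)
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coefficients, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coefficients
    ss_residual = residuals @ residuals
    ss_total = ((outcome - outcome.mean()) ** 2).sum()
    return 1 - ss_residual / ss_total

def relative_increase(r2_base, r2_new):
    """Proportional gain in explained variance of the new model over the base model."""
    return (r2_new - r2_base) / r2_base

# Simulated example (hypothetical): one domain score is the mean of three facets,
# and the outcome depends mainly on the first facet.
rng = np.random.default_rng(1)
facets = rng.normal(size=(300, 3))
domain = facets.mean(axis=1)
outcome = facets[:, 0] + rng.normal(size=300)

r2_domains = r_squared(domain.reshape(-1, 1), outcome)
r2_facets = r_squared(facets, outcome)

# Applied to the rounded means reported in the text:
gain_reported = relative_increase(0.35, 0.38)
```

With the rounded means of .35 and .38, this ratio comes to roughly 9%; the reported figure of 10% presumably rests on unrounded R² values. Because a domain score is a linear combination of its facets, the facet-level R² can never fall below the domain-level R² in the same sample, so the size of the gain, not its direction, is the informative quantity.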

Discussion
The main goals of the present research were to develop the Slovak version of the BFI-2 questionnaire, and to report its psychometric characteristics and associations with other Big Five questionnaires and selected well-being measures. Concerning reliability, the Slovak BFI-2 shows very good internal consistency at the domain level. At the facet level, the alpha coefficients were generally acceptable, although some facets were more internally consistent than others. Similar results obtained in other language adaptations of the BFI-2, such as German and Dutch (Danner et al., 2019; Denissen et al., in press), as well as comparisons with the Slovak NEO-FFI in the present research, suggest that some of the lower facet alpha reliabilities obtained here likely reflect the general difficulty of adapting psychological measures across cultures, as well as the overall data quality of paid online samples, rather than an issue specific to the Slovak BFI-2. These considerations may also explain our finding of moderate-to-large discriminant correlations between some BFI-2 domain scales. In their validation study of the Dutch BFI-2, Denissen et al. (in press) noted substantially poorer discriminant correlations in a paid online sample than in a student sample, and our paid sample showed similarly higher-than-normal discriminant correlations for both the BFI-2 and the NEO-FFI. However, additional research using different samples and measures is needed to confirm or refute these interpretations. Until then, we recommend caution in interpreting the Slovak BFI-2 facets with lower internal consistency, and we recommend that researchers keep discriminant correlations in mind when interpreting the Slovak BFI-2 domains.
Factor and principal components analyses suggested that the Slovak BFI-2 retains the measure's intended structure at both the domain and facet levels. The vast majority of items loaded primarily on their intended factor, with primary loadings substantially higher than secondary loadings. Principal component analysis of facets clearly recovered the intended BFI-2 structure, with three facets loading on each Big Five domain. Moreover, CFAs successfully replicated the results of the original BFI-2 validation study, in which the items within each Big Five domain could be adequately fit by a measurement model that included three substantive facet factors plus an acquiescence method factor (cf. Soto & John, 2017a). The results not only showed that the Slovak BFI-2 has the same robust hierarchical structure as the original English version, but also confirmed that acquiescence should be taken into account when studying questionnaire factor structure (Rammstedt et al., 2013; Soto et al., 2008). The BFI-2 minimizes the effect of acquiescence by balancing true-keyed and false-keyed items within each facet and domain scale, and PCAs of the 15 facets suggested that this effectively controls for acquiescence at the scale level. However, the CFA results clearly suggest that acquiescence should be accounted for as a method factor when modeling BFI-2 structure at the item level. As suggested by Soto and John (2017a), the BFI-2 is not only an example of effective control for acquiescence, but also a promising tool for future research examining the phenomenon of acquiescent responding itself through indexing or modeling individual differences in acquiescence across the content-balanced BFI-2 item set.
Validity of the BFI-2 was further examined through associations with three types of variables. First were the demographic variables of gender and age. Our results revealed patterns of age and gender differences similar to those obtained in previous Big Five research, as well as in the original BFI-2 study (Costa, Terracciano, & McCrae, 2001; Soto & John, 2017a; Soto et al., 2011). Second, correlations of the Slovak BFI-2 with the NEO-FFI and TIPI confirmed that the BFI-2 domains showed good convergence with both of these alternative measures. Third, correlations with selected well-being measures revealed a meaningful pattern of associations at both the domain and facet levels, as well as distinctive profiles of personality correlates for some well-being indicators (e.g., perceived stress, search for meaning in life). These results support the construct validity of the Slovak BFI-2, and suggest that it can be recommended as a reliable measure of Big Five domains and facets in the Slovak environment and cross-cultural research.

Limitations and Further Research
As mentioned above, the main limitation of our research lies in the specific characteristics of the sample. In this study, we used respondents recruited from a paid online research panel, which may have affected data quality. Although early research using paid online samples such as Amazon Mechanical Turk did not observe a substantial effect of reasonable compensation on general data quality (Buhrmester, Kwang, & Gosling, 2011), more recent research has observed differences in data quality between paid online panels and student or volunteer samples (e.g., Denissen et al., in press). Further research that administers the BFI-2 and other psychological measures to Slovak samples drawn from alternative sources could help clarify this issue.
Another limitation is that our study did not include peer-reported data for either the BFI-2 or the validity criteria. Therefore, future research could examine self-peer agreement, and also test the validity with peer-reported criteria. A third notable limitation is the rather narrow range of validity criteria, which focused specifically on well-being. Big Five personality traits have been shown to predict a broad range of cognitive, emotional and behavioral variables (e.g., Ozer & Benet-Martínez, 2006; Soto, in press). Our focus on selected well-being measures allowed us to examine this criterion domain in greater detail; however, many other variables remain unexamined. Further research including other criteria could provide more information about how the Slovak BFI-2 relates to a broad range of psychological variables, and how useful it can be for predicting consequential outcomes.

Conclusions
The present paper reports the development of the Slovak version of the BFI-2, its psychometric properties, and its capacity to predict selected well-being criteria. Based on descriptive and correlational analysis, alpha coefficients, and exploratory and confirmatory factor analysis, we can conclude that in general, the Slovak BFI-2 shows satisfactory psychometric properties, as well as a robust hierarchical factor structure. Moreover, the Slovak BFI-2 displays associations with gender, age and other Big Five measures that are generally consistent with previous research and theoretical assumptions. Finally, we found a meaningful pattern of validity correlations between the Slovak BFI-2 and well-being measures at both the domain and facet levels. We therefore recommend the Slovak BFI-2 for use in both Slovak and international personality research. We expect it will be a particularly valuable tool for researchers who wish to efficiently measure personality traits at both the Big Five domain and facet levels. Future research can replicate the findings using different samples, estimate additional psychometric properties of the Slovak BFI-2, and establish its predictive validity in greater detail.