Trust in HR AI Starts with Evidence: A KAIST Joint Experiment on TalentGPT for Global Hiring

Key Summary

The question this article addresses

“Regardless of whether someone has domain expertise, can AI meaningfully reduce the real-world burden of search and judgment in global hiring?”

Experimental design

🔹 Co-designed with KAIST researchers
🔹 Comparison of two conditions: manual filter setup vs. AI filter recommendations based on a natural-language JD
🔹 Participants explored candidates based on a given IT role JD and selected a final set of 3 candidates

Key results (statistically significant)

🔹 Search preparation time fell by an average of 75.7 seconds, so the system absorbs part of the upfront cost of global hiring, a cost that grows with country and language differences
🔹 Task dropout rate decreased by 11.85 percentage points, reducing “failed starts” in global talent exploration

Conclusion

TalentGPT can mitigate the problem where expertise gaps turn into performance gaps—not because AI “makes the decision,” but because it absorbs the burden of judgment at the system level.

An era when everyone must become a domain expert—yet that is impossible

<Figure 1. Wall Street Journal, 2025: “Why Companies Struggle to Find the Talent They Want in a Sea of Skills”>

Today, companies face an increasingly complex talent market. In particular, for IT and technology-centered roles, the breadth and depth of required skills are expanding rapidly, and as a result, the level of domain understanding demanded in hiring continues to rise. The World Economic Forum notes that the average “shelf life” of core competencies required for technical roles is shrinking quickly, putting HR in a structural position where it is difficult to keep pace continuously (WEF, 2024).

Under these conditions, an implicit assumption often takes hold: that recruiters themselves must become increasingly advanced domain experts. In reality, this assumption does not hold. According to the OECD, as job differentiation and technical specialization accelerate in the global labor market, it becomes structurally impossible for HR professionals to possess sufficient expertise across all roles and technical domains (OECD, 2023).

The key point is that this limitation is not simply an individual capability issue. As the speed of technological change increases, the burden of judgment on HR accumulates, which leads to slower hiring and suboptimal matches. In other words, the very structure that assumes everyone must become an expert is already an inefficient design.

Global and cross-border hiring is not an option but an inevitability

This problem is amplified even further in global and cross-border hiring environments. A recent LinkedIn Economic Graph report finds that cross-border hiring in IT and digital roles is increasing structurally, and that this is not a short-term trend but a medium-to-long-term shift in employment structures (LinkedIn Economic Graph, 2024).

The reason companies are forced to expand into global talent pools is clear. Within a single country, it is no longer feasible to secure enough talent reliably, and sustaining technological competitiveness requires connecting talent across multiple countries and regions. McKinsey similarly notes that companies that fail to build global hiring strategies in the competition for technical talent are likely to lose competitiveness in the long run (McKinsey, 2023).

However, global hiring is not simply a matter of widening the hiring scope. Job titles, how experience is described, and how tech stacks are expressed differ substantially across countries; even people with the same competencies often structure their resumes using entirely different language and context. As a result, HR is placed in a situation where the same person must be interpreted through different standards.

The essence of HR is not “perfect understanding,” but “fast fit judgment”

What matters is that HR’s purpose is not to understand every candidate perfectly. In real hiring practice, HR’s core role is to quickly identify candidates who fit the organization and the role within limited time. This is the essence of HR, whether hiring is global or local (Gartner, 2024).

The problem is that existing hiring systems do not adequately support this essence. Keyword-based search and manual filtering place excessive judgment responsibility on HR; when domain knowledge is insufficient, exploration cost and cognitive burden rise sharply. This ultimately creates a paradoxical situation: “Spending more time so as not to miss good talent.”

In global hiring, this burden grows even further. HR must consider not only technical fit, but also company culture, team context, and fit with local environments. A structure that depends entirely on individual expertise for such judgments is no longer sustainable.

The core of AI hiring debates is “trust,” and trust is a validation problem

To address these limitations, AI is being introduced into HR. But discussions about AI in hiring consistently converge on the issue of “trust.” Multiple studies report that when recruiters perceive AI’s judgment as opaque, they may avoid the algorithm or discontinue use altogether (Dietvorst et al., 2015; Lacroux & Martin-Lacroux, 2022).

The critical point is that trust in HR cannot be secured through declaration. The claim “AI is more accurate” is not sufficient in real work contexts. HR is a domain that involves legal and ethical accountability, and in global hiring, the cost of incorrect judgment becomes even greater. Therefore, trust in AI must be formed through verifiable evidence grounded in actual usage contexts.

In this context, recent research in HCI and information systems points out that trust is more likely to form when AI does not “replace” judgment, but instead structurally reduces the user’s burden of judgment (Ahn et al., 2021; Hosanagar & Lee, 2023).

Can TalentGPT be a real lever to solve these global cross-border challenges?

Starting from this problem framing, TalentGPT chose an empirical validation experiment—not merely a functional proposal or a conceptual proof. This study was designed together with KAIST researchers. It focused not on whether HR AI can “replace experts,” but on whether it can reduce judgment burden so that differences in expertise do not translate into differences in outcomes.

The experiment’s core question was clear:

“Regardless of whether a recruiter has domain expertise, can AI meaningfully reduce the burden of search and judgment in a global hiring environment?”

<Figure 2. Expected changes in candidate search activity depending on TalentGPT usage and IT expertise (domain expertise)>

To test this, participants were divided into two groups according to their IT knowledge. One group was framed as HR professionals with software and AI hiring experience, while the other was framed as HR professionals with only general hiring experience. This design reflects the reality that in global hiring, HR professionals often vary significantly in their level of understanding of technical roles.

<Figure 3. Results for the four selected groups based on TalentGPT usage and IT expertise>

Each participant was then randomly assigned to one of two conditions. As shown in the figure above, this yielded four groups, defined by the combination of TalentGPT usage (RAG Usage) and IT knowledge (IT Knowledge), across which participants were recruited and the experiment was conducted.

One condition used the conventional manual keyword-based filtering approach; the other condition used an AI approach where filters are recommended based on a natural-language JD. Participants explored a candidate pool based on a given IT role Job Description and completed a task to select the final three suitable candidates.
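The assignment step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say whether assignment was balanced within each expertise group, so the even split here is an assumption, and the participant IDs, the labels `it_expert`, `general`, `ai`, and `manual`, and the function `assign_conditions` are hypothetical names, not part of the study.

```python
import random
from collections import Counter


def assign_conditions(participants, seed=0):
    """Randomly split each expertise group between the 'ai' and 'manual' conditions.

    `participants` maps a participant ID to an expertise label.
    Balanced assignment within each group is an assumption of this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    assignments = {}
    for stratum in ("it_expert", "general"):
        ids = [pid for pid, level in participants.items() if level == stratum]
        rng.shuffle(ids)
        half = len(ids) // 2
        for i, pid in enumerate(ids):
            condition = "ai" if i < half else "manual"
            assignments[pid] = (stratum, condition)
    return assignments


# Hypothetical pool: 20 participants, half with IT hiring experience.
participants = {
    f"p{i:02d}": ("it_expert" if i % 2 == 0 else "general") for i in range(20)
}
assignments = assign_conditions(participants)

# Count participants in each of the four (expertise, condition) cells.
cell_sizes = Counter(assignments.values())
for cell, size in sorted(cell_sizes.items()):
    print(cell, size)
```

Randomizing within each expertise group, rather than across the whole pool, keeps the two conditions comparable on IT knowledge, which matters when the question is whether the effect holds regardless of expertise.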

<Figure 3. Real examples of the search environments participants faced depending on TalentGPT usage>

For outcome metrics, we focused on process-centered measures that matter in real HR practice. Specifically:
① dropout rate during the search and filter setup stage,
② time spent on search setup,
③ cognitive load during search (NASA-TLX based),
④ trust in the system and decision confidence.

In total, we selected these four measures and tested whether meaningful effects could be observed in the process of exploring suitable talent using TalentGPT, regardless of domain expertise.
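As a rough illustration of how the first three measures could be computed from session logs, the sketch below summarizes dropout rate, mean setup time, and a raw NASA-TLX score per condition. The `Session` record, its field names, and the sample values are hypothetical and are not data from the experiment. The raw (unweighted) TLX score, the mean of the six standard subscales, is one common scoring variant; the article only says the measure was "NASA-TLX based," so the exact scoring used in the study is an assumption here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional

# The six standard NASA-TLX subscales, each rated 0..100 in this sketch.
TLX_SUBSCALES = (
    "mental", "physical", "temporal", "performance", "effort", "frustration",
)


@dataclass
class Session:
    condition: str                         # "manual" or "ai"
    dropped_out: bool                      # left during search/filter setup
    setup_seconds: Optional[float] = None  # None if the participant dropped out
    tlx: Optional[dict] = None             # subscale name -> rating


def raw_tlx(ratings):
    """Raw TLX: unweighted mean of the six subscale ratings."""
    return mean(ratings[name] for name in TLX_SUBSCALES)


def summarize(sessions, condition):
    """Per-condition dropout rate, plus time and load among completers."""
    group = [s for s in sessions if s.condition == condition]
    completed = [s for s in group if not s.dropped_out]
    return {
        "dropout_rate": sum(s.dropped_out for s in group) / len(group),
        "mean_setup_seconds": mean(s.setup_seconds for s in completed),
        "mean_raw_tlx": mean(raw_tlx(s.tlx) for s in completed),
    }


# Made-up sessions for illustration only.
sessions = [
    Session("manual", False, 200.0, dict.fromkeys(TLX_SUBSCALES, 60)),
    Session("manual", True),
    Session("ai", False, 100.0, dict.fromkeys(TLX_SUBSCALES, 40)),
    Session("ai", False, 120.0, dict.fromkeys(TLX_SUBSCALES, 50)),
]

for condition in ("manual", "ai"):
    print(condition, summarize(sessions, condition))
```

Note that setup time and cognitive load can only be measured for participants who finished the setup stage, which is why dropout is tracked as its own metric rather than folded into the others.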

Experimental results: Structurally absorbing the “expertise gap” in global hiring

The experimental results showed a consistent pattern across all major metrics. First, from an efficiency perspective, the AI-based filter recommendation system produced clear improvements over manual keyword filtering. The group using RAG-based filter recommendations spent roughly 90 to 100 or more seconds less on average in the search setup stage, with 75.7 seconds being the statistically significant effect, and this reduction was observed consistently regardless of whether participants had IT domain knowledge. This indicates that the efficiency gain is not limited to a specific user subgroup but operates independently of user expertise.
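For readers who want to see what "statistically significant time reduction" means operationally, here is a minimal sketch of one way to compare setup times between two conditions: a two-sided permutation test on the difference in means. The article does not state which statistical test the researchers used, so the permutation test is a stand-in chosen for simplicity, and the timing values below are invented for illustration; they are not the study's data.

```python
import random
from statistics import mean


def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed difference mean(a) - mean(b), approximate p-value).
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Shuffle the condition labels and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:len(a)]) - mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_resamples + 1)


# Hypothetical setup times in seconds (not from the experiment).
manual_setup = [210, 185, 240, 198, 225, 205]
ai_setup = [130, 118, 150, 125, 140, 135]

reduction, p_value = permutation_test(manual_setup, ai_setup)
print(f"mean reduction: {reduction:.1f} s, p = {p_value:.4f}")
```

To check that an effect holds regardless of expertise, the same comparison can be run separately within the IT-experienced and general-experience groups.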

This time reduction has particular significance in global and cross-border hiring. In global hiring, cognitive burden associated with role understanding, technical interpretation, and filter setup increases further due to country and language differences. The experiment results empirically demonstrate that an AI-based approach can absorb this burden at the system level without relying on individual expertise.

The second notable result was the task dropout rate. In the manual keyword filtering environment, about 14% of participants dropped out during the stage where they had to configure filters directly, whereas in the AI-based filter recommendation environment, dropout decreased substantially. As a result, the overall task completion rate increased by roughly 16 percentage points or more. This suggests that the filter setup stage itself had been acting as a structural barrier to entry for users, and that by removing this bottleneck, AI meaningfully improved task persistence.
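Because dropout results are reported in percentage points, it may help to see the arithmetic. The sketch below computes the difference between two dropout rates in percentage points and a pooled two-proportion z-test. The counts are invented for illustration (only the roughly 14% manual dropout figure echoes the article), the function name `dropout_comparison` is hypothetical, and the article does not say which test the researchers applied, so the z-test is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def dropout_comparison(drop_a, n_a, drop_b, n_b):
    """Compare two dropout rates.

    Returns (difference in percentage points, z statistic, two-sided p-value)
    using the pooled two-proportion z-test (a normal approximation).
    """
    p_a, p_b = drop_a / n_a, drop_b / n_b
    diff_pp = (p_a - p_b) * 100  # percentage points, not relative percent
    pooled = (drop_a + drop_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff_pp, z, p_value


# Hypothetical counts: 14 of 100 dropped out with manual filters, 2 of 100 with AI.
diff_pp, z, p_value = dropout_comparison(14, 100, 2, 100)
print(f"difference: {diff_pp:.2f} pp, z = {z:.2f}, p = {p_value:.4f}")
```

A drop from 14% to 2% is 12 percentage points but about an 86% relative reduction, which is why the unit matters when reading the results. With small dropout counts the normal approximation is rough, and an exact test may be more appropriate.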

This result becomes even more important in global hiring. In cross-border hiring, the judgment required during filter setup becomes more complex and uncertainty increases, making early dropout more likely. The experiment suggests that AI-based filter recommendations can structurally alleviate the problem that global talent exploration “fails to even start,” by removing the initial entry barrier.

Meanwhile, in the candidate exploration stage after search setup, an interesting result emerged. Behavioral exploration metrics—such as the number of candidate detail views, page navigation counts, and time spent reviewing candidates—did not show statistically significant differences depending on AI usage. This implies that the time saved through AI was not reinvested into expanding the depth or breadth of exploration; instead, it was used to shorten the overall task time.

This finding aligns with real decision-making strategies in global hiring. In global hiring contexts, HR often adopts a satisficing strategy—securing sufficiently satisfactory candidates quickly—rather than exploring indefinitely for an “optimal” candidate. AI-based filter recommendations appear to have reinforced this realistic decision strategy, and this can be interpreted as a result that fits practical global hiring contexts.

The cognitive load analysis also provides important implications. Across the full sample, AI-based filter recommendations showed an effect in the direction of reducing subjective cognitive load (mental demand and effort), and this effect remained even after controlling for age and gender. This suggests that by removing unnecessary clicks and manipulations, AI effectively reduced extrinsic cognitive load.

However, within the non-expert group, an interesting cognitive mismatch was observed. Even though search speed more than roughly doubled with AI usage, the subjectively reported cognitive load did not decrease substantially. This can be interpreted as follows: while the extrinsic burden of manipulating filters decreased, the intrinsic cognitive load of uncertainty about “what to search for” still remained. The KAIST researchers defined this phenomenon as load substitution.

Finally, the analysis of domain knowledge provides important implications for global hiring system design. Domain knowledge had a statistically meaningful effect on whether participants dropped out, but among users who completed the task, there were no meaningful differences in search efficiency, cognitive load, trust, or decision satisfaction. This suggests that domain knowledge may not directly improve performance; rather, it may function as a condition for overcoming the initial entry barrier. Once users enter the task, system design appears to be what drives performance.

Taken together, this experiment shows that the effects of AI-based filter recommendations did not arise because AI replaced a domain expert’s judgment. Instead, AI enabled consistent efficiency improvements independent of user expertise by redesigning the structure so that expertise differences do not become unnecessary burdens in the global and cross-border hiring process. This can be viewed as empirical evidence that clarifies the role HR AI should play in a global expansion phase.

<Figure 4. Summary table of the KAIST joint experiment results>

What should TalentGPT do going forward to “practically” support cross-border hiring in global expansion?

The implications of this study and experiment results are clear. Future HR AI should not assume that all users can become domain experts. In an environment where the pace of technological change is fast and differentiation of roles and skills continues to accelerate, it is structurally impossible for all HR professionals to build sufficient expertise across every country, industry, and technical domain. Instead, HR AI should play the role of absorbing the burden of judgment so that expertise gaps do not become performance gaps.

This need becomes even more evident in a global environment where cross-border hiring becomes routine. Today, to maintain global competitiveness, companies must solve three tasks simultaneously.
First, they must secure a sufficiently broad global talent pool not limited to a single country or region.
Second, they must quickly screen candidates who fit the company and role among candidates with different backgrounds and different resume structures.
Third, beyond technical fit, they must judge whether candidates align with local context and role expectations as well.

However, relying solely on individual HR domain knowledge and experience to make all these judgments is no longer sustainable. In global hiring, lack of domain knowledge is likely to lead to search failure, excessive time consumption, or dropout at early stages. This is not a problem that occurs because talent does not exist; it occurs because the judgment structure is designed to depend excessively on individual capability.

This experiment both reveals these limitations clearly and shows empirically how AI can complement them. AI-based filter recommendations reduced the burden of search and judgment for both high-expertise and low-expertise users, and for users with lower expertise in particular, they meaningfully lowered the initial entry barrier in global hiring. This was not because AI replaced experts’ judgment; it was because AI redesigned the structure of search and filtering so that differences in expertise do not operate as unnecessary costs.

Based on these results, TalentGPT defines the role of AI not as a “decision surrogate,” but as an assistant that enables judgment. TalentGPT is not a system that makes decisions in place of HR; it is a system that organizes the structure of exploration and removes unnecessary burdens so that HR can judge more consistently and efficiently even in global environments. This is a practical approach that keeps HR responsibility and control intact while enabling organizations to handle the complexity of global hiring.

Ultimately, the value of HR AI is not in the size of the model or technical sophistication. What matters is how well it supports HR’s fundamental goal—finding suitable talent quickly—under real constraints of global and cross-border hiring. Through a KAIST joint experiment, TalentGPT validated that this role is not merely a conceptual claim, but can be implemented in practice and is effective. This provides a clear benchmark for where global HR technology should head.

At the same time, these results do not mean the problem is fully solved. Global and cross-border hiring environments will continue to change, and the definition of roles and competencies, as well as requirements for local context, will become more precise. TalentGPT will treat this experiment as a starting point, continuously observe the judgment difficulties HR actually faces across diverse countries and industry contexts, and improve the system in ways that structurally alleviate those difficulties. In a reality where not all HR professionals can become domain experts, TalentGPT will continue responsible technology development grounded in empirical validation—so that global talent exploration and judgment can become more fair and efficient, and so that cross-border hiring can become an option accessible to broader organizations rather than only to a small set of experts.


References

World Economic Forum (WEF). (2024). The Future of Jobs Report 2024. World Economic Forum, Geneva.
OECD. (2023). OECD Employment Outlook 2023: Artificial Intelligence and the Labour Market. Organisation for Economic Co-operation and Development, Paris.
LinkedIn Economic Graph. (2024). Global Hiring Trends and Cross-Border Talent Mobility. LinkedIn Corporation.
McKinsey & Company. (2023). The Global Talent Crunch: Rethinking Hiring in a Borderless World. McKinsey Global Institute.
Gartner. (2024). HR Technology Trends 2024: From Automation to Decision Enablement. Gartner Research.
Dietvorst, B. J., Simmons, J. P., & Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114–126.
Lacroux, A., & Martin-Lacroux, C. (2022). Algorithmic management, transparency, and trust in human–AI decision-making. Human Resource Management Review, 32(4), 100860.
Ahn, M., Kim, J., & Sung, Y. (2021). The effect of explainable AI on trust and decision confidence in human–AI collaboration. International Journal of Human–Computer Studies, 150, 102608.
Hosanagar, K., & Lee, D. (2023). Human–AI interaction and the future of decision support systems. Management Science, 69(1), 1–17.
Wall Street Journal. (2025). Why Companies Struggle to Find the Talent They Want in a Sea of Skills. The Wall Street Journal.
