User Ingestion Processors Guide
Overview
Processors are optional data transformation steps in your User Import Flow. They clean, filter, and enhance user data during ingestion from source systems (Okta, Active Directory, Workday, etc.).
Note: Most processors operate on individual source data before merging, unless specifically stated otherwise. Processors execute in the order listed.
Table of Contents
- Quick Reference
- How to Configure
- Available Processors
- Best Practices
- Common Scenarios
- Troubleshooting
- Rule Syntax Reference
- Limitations
Quick Reference
How to Configure
Navigation
- Navigate to Import Users
- Select your source(s) on the Connectors page
- Proceed to Configure Selected Sources
- Click Advanced Mode
In Advanced Mode
Processors to Apply: Add transformation processors that run during ingestion. Processors execute in the order listed.
Filter and Attribute List: Control which records and fields are imported at the source level before any processors run.
Available Processors
Filter Users by Field Value
Processor Name: User Filter Processor
Excludes users from ingestion when a specified field matches any value in your exclusion list. This is a simple, single-field filter that performs exact matching.
Use Cases: Exclude terminated employees, contractors, test accounts, or specific departments based on a single field value.
Configuration:
Examples:
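For instance, to drop terminated and inactive accounts (the field and value names here are illustrative; match them to your source's schema):

```
Filter Key:  employment_status
Filter List: Terminated, Inactive
```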
Filter Users by Rule
Processor Name: Filter Rule Post Processor
Excludes users from ingestion based on complex conditional logic. Unlike the simple Field Value filter, this processor allows you to combine multiple field conditions using AND/OR logic, perform date comparisons, and apply sophisticated filtering rules.
Use Cases: Apply multi-condition filtering (e.g., "active AND hired after date"), date-based filtering, or any logic requiring multiple field comparisons.
Configuration:
Examples:
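One illustrative rule for the "active AND hired after date" case (the field names and date literal format are assumptions; verify them against your source data and the Rule Syntax Reference):

```
employment_status == "Active" AND hire_date > "2022-01-01"
```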
💡 Tip: See Rule Syntax Reference for complete syntax.
Remove Duplicate Users
Processor Name: DSL First Match Dedupe Processor
When the same user appears multiple times (identified by your Index Key, typically email), this processor evaluates all duplicate records and keeps only the first one that matches your filter condition. All other duplicates are discarded. This operates across all sources after they're merged.
Use Cases: Multiple integrations provide overlapping users, need to select which source's data to prioritize, ensure each user appears only once in the final roster.
⚠️ Note: Can be attached to any source - operates on merged data from all sources after ingestion.
Configuration:
Common Rules:
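For example, to keep the duplicate record that carries an employee ID (note the `record.` prefix this processor requires):

```
record.employee_id != ""
```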
⚠️ Important: Always set Lowercase to `true` when using `email_addr` as the Index Key.
Set User Timezone
Processor Name: User Timezone Processor
Automatically infers and populates the user's timezone field by analyzing their location information (city, state, country). The processor uses geographic data to determine the most likely timezone for each user's location.
Use Cases: Source system doesn't provide a timezone field, need consistent timezone data for time-based notifications and scheduling.
Configuration: No configuration needed - just add the processor. It automatically reads from standard location fields.
Calculate Password Expiration
Processor Name: User Password Meta Info Processor
Fills in missing password date information using your organization's password policy configuration. This processor operates on two fields in the user record: password_last_changed and password_expires.
What Fields It Uses:
- Input fields: `password_meta_info.password_last_changed` (date), `password_meta_info.password_expires` (date)
- Password policy: uses your org's configured `password_expiry_in_days` setting
- Output: populates whichever field is missing
How It Works:
Configuration:
Use Cases:
- Source provides only one of the two password fields
- Need complete password data for expiry notifications and password reset workflows
- Source system password policy differs slightly from Moveworks org configuration
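The fill-in logic can be sketched as follows. This is a minimal illustration of the described behavior, not the actual processor; it assumes the two dates and the policy's day count are available as shown above:

```python
from datetime import date, timedelta

def fill_password_meta(last_changed, expires, password_expiry_in_days):
    """Populate whichever of the two password dates is missing,
    using the org's password_expiry_in_days policy setting."""
    expiry = timedelta(days=password_expiry_in_days)
    if last_changed is None and expires is not None:
        # Only the expiry date was provided: back-calculate last change.
        last_changed = expires - expiry
    elif expires is None and last_changed is not None:
        # Only the last change date was provided: project the expiry.
        expires = last_changed + expiry
    return last_changed, expires

# Source provided only password_last_changed; policy is 90 days.
changed, exp = fill_password_meta(date(2024, 1, 1), None, 90)
```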
Add Location Coordinates
Processor Name: User Geocode Processor
Enriches user records with geographic coordinates (latitude/longitude) by geocoding their location information. The processor constructs a location query from specified fields, sends it to a geocoding service, and adds the resulting coordinates to the user's geocodes field.
What Fields It Uses:
- Input: any combination of location fields you specify (typically `country_code`, `state`, `city`)
- Output: populates the `geocodes` field with latitude/longitude data
Use Cases:
- Enable location-based analytics and reporting
- Support features that require geographic coordinates
- Enrich user profiles with precise location data
⚠️ Important: Attach to the source that contains the location fields you want to geocode.
Performance Note: Makes external API calls for geocoding - may slow ingestion for large user sets.
Configuration:
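A rough sketch of the build-query-then-enrich flow described above. The `lookup` callable is a hypothetical stand-in for the external geocoding service; the real processor's API and field handling may differ:

```python
def geocode_user(user, fields=("city", "state", "country_code"), lookup=None):
    """Build a location query from the configured fields and attach
    coordinates to the user's geocodes field. `lookup` stands in for
    the external geocoding service call (hypothetical)."""
    # Join whichever configured location fields are present on this user.
    query = ", ".join(str(user[f]) for f in fields if user.get(f))
    coords = lookup(query) if query and lookup else None
    if coords is not None:
        user["geocodes"] = {"latitude": coords[0], "longitude": coords[1]}
    return user

# Hypothetical lookup table standing in for the geocoding API.
fake_service = {"San Jose, CA, US": (37.34, -121.89)}
user = geocode_user(
    {"city": "San Jose", "state": "CA", "country_code": "US"},
    lookup=fake_service.get,
)
```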
Resolve Manager Relationships
Processor Name: Unified Resolve Manager Processor
Establishes manager-employee relationships by resolving manager email addresses to internal user IDs. This processor builds an index of all users (email → ID), then replaces each user's manager_email field value with the corresponding manager's internal ID, enabling proper organizational hierarchy.
What Fields It Uses:
- Input: `manager_email` (manager's email address)
- Index built from: `email_addr` (all users' emails)
- Output: replaces the `manager_email` value with the manager's internal identifier
How It Works:
- Builds an index mapping every user's email address to their internal ID
- For each user record, looks up their `manager_email` in the index
- Replaces the email with the manager's internal ID
- Result: Proper manager-employee links throughout the organization
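The two-pass index-and-replace approach above can be sketched like this (a minimal illustration with assumed dict keys, not the actual processor):

```python
def resolve_managers(users):
    """Replace each user's manager_email with the manager's internal ID.

    users: list of dicts with 'id', 'email_addr', and 'manager_email' keys
    (assumed shapes for illustration).
    """
    # Pass 1: build the email -> internal ID index across all users.
    index = {u["email_addr"].lower(): u["id"] for u in users}
    # Pass 2: swap manager_email for the manager's ID where a match exists.
    for u in users:
        manager_id = index.get((u.get("manager_email") or "").lower())
        if manager_id is not None:
            u["manager_email"] = manager_id
    return users

roster = resolve_managers([
    {"id": "u1", "email_addr": "ceo@corp.com", "manager_email": ""},
    {"id": "u2", "email_addr": "eng@corp.com", "manager_email": "CEO@corp.com"},
])
```

Lowercasing both sides of the lookup mirrors the email-casing caveat called out for deduplication.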
Use Cases:
- Source provides manager email instead of manager ID
- Need to build organizational reporting hierarchy
- Manager data comes from different source than employee data
⚠️ Note: Can be attached to any source - operates on all users after merge. Add AFTER deduplication to ensure manager links resolve correctly.
Configuration: No configuration needed - just add the processor.
Best Practices
1. Filter Early
Add filter processors before enrichment (like geocoding) to reduce processing time.
2. Deduplicate Before Manager Resolution
If using both processors, always apply deduplication first.
3. Use Lowercase for Email Deduplication
When deduplicating by email, always set Lowercase to true.
4. Attach Geocode to Source with Location Data
Add the geocode processor to the source that has location fields (country_code, state, city).
5. Test with Sample Data First
- Configure processor on test integration
- Run ingestion with small sample
- Verify results match expectations
- Apply to production
Common Scenarios
Scenario 1: Basic Filtering
Goal: Exclude terminated and inactive users from Okta.
Steps:
- Import Users → Select Okta → Advanced Mode
- In Processors to Apply, add: Filter Users by Field Value
- Configure: Filter Key: `employment_status`, Filter List: `Terminated, Inactive`
Scenario 2: Multi-Source with Deduplication
Goal: Use both Okta and Workday, preferring records with employee IDs.
Okta Source:
- Import Users → Select Okta → Advanced Mode
- Add: Set User Timezone
Workday Source:
- Import Users → Select Workday → Advanced Mode
- Add: Filter Users by Field Value
- Filter Key: `worker_type`, Filter List: `Contractor, Temp`
Either Source (Deduplication):
- Add: Remove Duplicate Users
- Index Key: `email_addr`
- Filter Condition: `record.employee_id != ""`
- Lowercase: `true`
Scenario 3: Complex Filtering
Goal: Keep only active, full-time employees with company email addresses.
Steps:
- Import Users → Select source → Advanced Mode
- Add: Filter Users by Rule
- Configure Filter Condition:
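A possible condition for this goal. The field names, value spellings, and especially the email-matching operator are assumptions; confirm them against your source data and the Rule Syntax Reference before using:

```
employment_status == "Active" AND worker_type == "Full-Time" AND email_addr ENDS WITH "@company.com"
```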
Scenario 4: Manager Hierarchy
Goal: Establish manager relationships when source provides manager emails.
Steps:
- Import Users → Select any source → Advanced Mode
- Add: Resolve Manager Relationships (no configuration needed)
Note: Add AFTER any deduplication processors.
Troubleshooting
❌ Too many users filtered out
Solution:
- Review filter conditions and test with small sample
- Verify field names match source data exactly (case-sensitive)
- Check logical operators match intent (AND vs OR)
❌ Duplicate users still appearing
Check:
- ✓ Lowercase set to `true` for email-based deduplication
- ✓ Index Key matches field name exactly (case-sensitive)
- ✓ Filter condition correctly identifies preferred record
❌ Manager relationships not working
Check:
- ✓ Manager processor added AFTER deduplication
- ✓ Manager emails exist in ingested user data
- ✓ Manager email field populated in source data
❌ Rule syntax error
Check:
- ✓ Field names match exactly (case-sensitive)
- ✓ Strings in quotes: `"value"` not `value`
- ✓ Lists use brackets: `["value1", "value2"]`
Rule Syntax Reference
Filter Users by Rule
Direct field names, no prefix needed.
Basic Comparisons
List Operations
Text Matching
Combining Conditions
Examples
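A few illustrative rules built from the constructs confirmed elsewhere in this guide (quoted strings, bracketed lists, AND/OR, case-sensitive field names). The list-membership operator shown here is an assumption; exact operator names may vary:

```
employment_status == "Active"
department IN ["Engineering", "Sales"]
employment_status == "Active" AND region != "EMEA"
```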
Remove Duplicate Users
Uses the `record.` prefix to access fields.
Basic Comparisons
List Operations
Text Matching
Examples
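Equivalent illustrative examples using the `record.` prefix this processor requires (field names beyond `employee_id` are assumptions):

```
record.employee_id != ""
record.employment_status == "Active"
```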
Limitations
Processor Limits:
- Maximum 20 processors per integration source
- Processors run in configured order
- No processor loops or conditional execution
Rule Constraints:
- Field names are case-sensitive
- Changes require running ingestion to take effect
Performance Considerations:
- Geocoding processors make external API calls (slower)
- Large filter lists may impact performance
- Test with sample data before full ingestion