Git - Attributes
Advanced
Git attributes provide fine-grained control over how Git handles files in your repository. Through the .gitattributes file, you can customize line ending handling, apply filters, define merge strategies, and more.
What are Git Attributes?
Git attributes are per-path settings that control Git's behavior for specific files or patterns:
# .gitattributes file format
pattern attr1 attr2=value -attr3
# Examples
*.txt text
*.jpg binary
*.js text eol=lf
*.md merge=union
Attributes can be:
- Set: `attribute` (boolean true)
- Unset: `-attribute` (boolean false)
- Set to value: `attribute=value`
- Unspecified: Not mentioned, or reset with `!attribute` (Git falls back to its defaults)
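The four states can be observed directly with `git check-attr`. The following is a minimal sketch, assuming `git` is on the PATH and a throwaway repository is acceptable; the attribute name `demo` and the file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: show the four attribute states with git check-attr.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# "demo" is a hypothetical attribute name used only for this example.
cat > .gitattributes <<'EOF'
set.txt    demo
unset.txt  -demo
value.txt  demo=hello
EOF

# check-attr works on path names; the files do not need to exist.
states=$(git check-attr demo -- set.txt unset.txt value.txt other.txt)
printf '%s\n' "$states"
```

Each path should be reported on its own line as `set`, `unset`, `hello`, and `unspecified` respectively.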
The .gitattributes File
File Locations
# Repository-specific (committed with project)
.gitattributes
# Global (per user)
~/.config/git/attributes
# System-wide (all users)
$(prefix)/etc/gitattributes
# Info attributes (not committed)
.git/info/attributes
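When the same attribute is set in more than one of these files, `.git/info/attributes` takes precedence over `.gitattributes`, which in turn takes precedence over the global and system-wide files. A small sketch of the first part of that rule, assuming a scratch repository and a made-up file name `notes.txt`:

```shell
#!/bin/sh
# Sketch: .git/info/attributes overrides the committed .gitattributes.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

echo '*.txt eol=lf' > .gitattributes
mkdir -p .git/info
echo '*.txt eol=crlf' > .git/info/attributes

# The value from .git/info/attributes is the one reported.
precedence=$(git check-attr eol -- notes.txt)
printf '%s\n' "$precedence"
```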
Pattern Matching
# Exact filename
README.md text
# Wildcards
*.txt text
*.* text
**/*.js text
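# Directory contents (a trailing-slash pattern such as docs/ does not apply to the files inside it)
docs/** text
src/**/*.py text
# Overrides (negative patterns such as !*.jpg are not allowed; a later matching line wins instead)
*.* text
*.jpg -text
*.png -text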
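Pattern behavior is easy to verify before committing. Git does not accept negative (`!`) patterns in attribute files, so exceptions are written as later lines that unset the attribute. The sketch below assumes a scratch repository; all file and directory names are hypothetical.

```shell
#!/bin/sh
# Sketch: later lines override earlier ones, and dir/** reaches nested files.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

cat > .gitattributes <<'EOF'
*.* text
*.jpg -text
docs/** eol=lf
legacy/ eol=crlf
EOF

# Query two attributes for four hypothetical paths.
patterns=$(git check-attr text eol -- notes.txt photo.jpg docs/guide/intro.md legacy/old.txt)
printf '%s\n' "$patterns"
```

In this sketch `photo.jpg` ends up with `text: unset`, `docs/guide/intro.md` picks up `eol: lf`, and `legacy/old.txt` stays `eol: unspecified` because the trailing-slash pattern does not apply to files inside the directory.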
Text and Line Ending Attributes
Text Attribute
# Mark files as text
*.txt text
*.md text
*.json text
*.yml text
# Mark files as binary
*.jpg binary
*.png binary
*.exe binary
# Auto-detect text vs. binary from the file content
*.unknown text=auto
End-of-Line (EOL) Handling
# Platform-specific line endings
# (comments must be on their own line; .gitattributes has no trailing-comment syntax)
# Windows (CRLF)
*.txt text eol=crlf
# Unix/Linux (LF)
*.sh text eol=lf
# Windows batch files
*.bat text eol=crlf
# Cross-platform projects
* text=auto
*.txt text
*.md text
*.js text eol=lf
*.json text eol=lf
# Platform-specific files
*.sh text eol=lf
*.bat text eol=crlf
*.ps1 text eol=crlf
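The effect of `eol=crlf` can be checked with `git ls-files --eol`, which reports the line endings in the index (`i/`) and in the working tree (`w/`). This is a minimal sketch under the assumption of a scratch repository; `run.bat` is a made-up file.

```shell
#!/bin/sh
# Sketch: a file stored with LF is checked out with CRLF when eol=crlf is set.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

echo '*.bat text eol=crlf' > .gitattributes
printf 'echo hi\n' > run.bat                  # written with LF only
git add .gitattributes run.bat 2>/dev/null    # Git may warn about the pending conversion

rm run.bat
git checkout -q -- run.bat                    # re-create the file from the index

eol_info=$(git ls-files --eol run.bat)
printf '%s\n' "$eol_info"
```

The output should show `i/lf` and `w/crlf`: the repository keeps LF while the working copy receives CRLF.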
Complete Cross-Platform Setup
# .gitattributes for cross-platform project
# Set default behavior for all files
* text=auto
# Text files that should have consistent LF line endings
*.js text eol=lf
*.jsx text eol=lf
*.ts text eol=lf
*.tsx text eol=lf
*.json text eol=lf
*.md text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
*.xml text eol=lf
*.html text eol=lf
*.css text eol=lf
*.scss text eol=lf
# Shell scripts need LF
*.sh text eol=lf
*.bash text eol=lf
# Windows files need CRLF
*.bat text eol=crlf
*.cmd text eol=crlf
*.ps1 text eol=crlf
# Binary files
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.pdf binary
*.zip binary
*.tar.gz binary
*.exe binary
*.dll binary
*.so binary
Filters
Filters transform file contents during checkout and checkin:
Clean and Smudge Filters
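# Configure filters: clean runs when staging, smudge runs on checkout
# clean strips the date so the stored content stays stable
git config filter.keywords.clean 'sed "s/Date:.*/Date:/"'
# smudge fills in the checkout date in the working tree
git config filter.keywords.smudge 'sed "s/Date:.*/Date: $(date)/"'
# Apply filter to files (in .gitattributes)
*.template filter=keywords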
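To see which side each filter affects, it helps to compare the staged blob with the checked-out file. The sketch below assumes a scratch repository and uses a fixed marker (`CHECKED-OUT`) instead of a real date so the result is predictable; `page.template` is a hypothetical file.

```shell
#!/bin/sh
# Sketch: clean shapes what is stored, smudge shapes what is checked out.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

git config filter.keywords.clean  'sed "s/Date:.*/Date:/"'
git config filter.keywords.smudge 'sed "s/Date:.*/Date: CHECKED-OUT/"'
echo '*.template filter=keywords' > .gitattributes

printf 'Title: demo\nDate: 2024-01-01\n' > page.template
git add .gitattributes page.template

# Raw blob content in the index, i.e. the output of the clean filter.
stored=$(git cat-file -p :page.template)

rm page.template
git checkout -q -- page.template              # smudge runs here
checked_out=$(cat page.template)

printf 'stored:\n%s\nchecked out:\n%s\n' "$stored" "$checked_out"
```

The stored version carries a bare `Date:` line, while the working copy carries the value written by the smudge filter.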
Practical Filter Examples
# Remove sensitive data before commit (one-way: the original value is not restored on checkout)
git config filter.secrets.clean 'sed "s/password=.*/password=HIDDEN/"'
git config filter.secrets.smudge 'cat'
# Format JSON files
git config filter.json.clean 'jq .'
git config filter.json.smudge 'cat'
# Apply filters
*.config filter=secrets
*.json filter=json
Advanced Filter Script
#!/bin/bash
# .git/filters/clean-secrets.sh
# Remove API keys and passwords
sed -e 's/api_key=.*/api_key=REDACTED/' \
    -e 's/password=.*/password=REDACTED/' \
    -e 's/"token":[[:space:]]*"[^"]*"/"token": "REDACTED"/'
# Configure advanced filter (the script must be executable: chmod +x .git/filters/clean-secrets.sh)
git config filter.secrets.clean '.git/filters/clean-secrets.sh'
git config filter.secrets.smudge 'cat'
# Apply to configuration files
*.env filter=secrets
config/*.json filter=secrets
Merge Strategies
Built-in Merge Drivers
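# Union merge - keep lines from both sides instead of raising a conflict
CHANGELOG.md merge=union
# Keep ours - prefer the current branch ("ours" is not built in; define it first)
# git config merge.ours.driver true
*.generated merge=ours
# Binary merge - don't attempt text merge
*.db merge=binary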
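A union merge is easiest to understand by provoking a conflict and watching it disappear. The following sketch assumes a scratch repository with a placeholder identity; branch and entry names are made up.

```shell
#!/bin/sh
# Sketch: merge=union keeps both sides of a would-be conflict.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo 'CHANGELOG.md merge=union' > .gitattributes
printf '# Changelog\n' > CHANGELOG.md
git add .gitattributes CHANGELOG.md
git commit -q -m "base"
base=$(git symbolic-ref --short HEAD)

# Both branches append a different line at the same place.
git checkout -q -b feature
printf '# Changelog\n- feature entry\n' > CHANGELOG.md
git commit -q -am "feature entry"

git checkout -q "$base"
printf '# Changelog\n- main entry\n' > CHANGELOG.md
git commit -q -am "main entry"

git merge -q -m "merge feature" feature
merged=$(cat CHANGELOG.md)
printf '%s\n' "$merged"
```

Without the attribute this merge would stop with a conflict; with it, both entries are kept. Union merges suit append-only text such as changelogs, but they can silently produce nonsense in structured files.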
Custom Merge Drivers
# Configure custom merge driver
git config merge.npm-merge.name "NPM package.json merge"
git config merge.npm-merge.driver "npm-merge-driver %O %A %B %L"
# Apply to package.json
package.json merge=npm-merge
package-lock.json merge=npm-merge
Merge Driver for Different File Types
# Database schema files - manual merge required
*.sql merge=manual
git config merge.manual.driver false
# Documentation - union merge
*.md merge=union
docs/*.rst merge=union
# Generated files - always use ours (needs: git config merge.ours.driver true)
dist/* merge=ours
build/* merge=ours
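`merge=ours` only works once a driver with that name exists; a common way to define one is a command that exits successfully without touching the current version. The sketch below assumes a scratch repository with a placeholder identity, and `dist/bundle.js` is a hypothetical generated file.

```shell
#!/bin/sh
# Sketch: a user-defined "ours" merge driver keeps the current branch's version.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# "true" exits with status 0 and leaves our version of the file untouched.
git config merge.ours.driver true
echo 'dist/* merge=ours' > .gitattributes

mkdir dist
echo 'base build' > dist/bundle.js
git add .gitattributes dist/bundle.js
git commit -q -m "base"
base=$(git symbolic-ref --short HEAD)

git checkout -q -b feature
echo 'feature build' > dist/bundle.js
git commit -q -am "feature build"

git checkout -q "$base"
echo 'main build' > dist/bundle.js
git commit -q -am "main build"

git merge -q -m "merge feature" feature
kept=$(cat dist/bundle.js)
printf '%s\n' "$kept"
```

Because the driver lives in `git config`, every clone has to define it; the attribute alone is not enough.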
Diff Drivers
Custom Diff Handling
# Configure diff drivers
git config diff.word.textconv "strings"
git config diff.json.textconv "jq ."
# Apply to files
*.doc diff=word
*.json diff=json
# diff=pdf only takes effect once a diff.pdf.textconv command is configured as well
*.pdf diff=pdf
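A `textconv` command receives the file name as its last argument and prints a text rendering that Git then diffs. The sketch below illustrates the mechanism with a hypothetical driver named `hex` that uses `od` to dump bytes; the driver name, the `*.bin` pattern, and the file are assumptions chosen only because `od` is available almost everywhere.

```shell
#!/bin/sh
# Sketch: a textconv diff driver turns a binary file into diffable text.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical driver: render each byte as hex.
git config diff.hex.textconv 'od -An -tx1'
echo '*.bin diff=hex' > .gitattributes

printf '\000\001\002' > blob.bin
git add .gitattributes blob.bin
printf '\000\001\003' > blob.bin              # change the last byte

# git diff runs textconv on both versions before comparing them.
hexdiff=$(git diff -- blob.bin)
printf '%s\n' "$hexdiff"
```

Instead of "Binary files differ", the diff shows a removed and an added line of hex bytes.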
Image Diff
# Configure image diff
git config diff.image.textconv "identify"
# Apply to image files
*.png diff=image
*.jpg diff=image
*.gif diff=image
Export Attributes
Archive Exclusion
# Exclude from git archive
.gitignore export-ignore
.gitattributes export-ignore
tests/ export-ignore
docs/ export-ignore
*.test.js export-ignore
# Keep in archives even if a broader pattern marks it export-ignore
important-file.tmp -export-ignore
Export Substitution
# Substitute keywords in archives
version.txt export-subst
# In version.txt file:
# Version: $Format:%H$
# Date: $Format:%ci$
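The substitution happens only inside `git archive`; the committed file keeps the literal placeholder. A minimal sketch, assuming a scratch repository with a placeholder identity and a `tar` that supports `-O` (extract to standard output):

```shell
#!/bin/sh
# Sketch: export-subst expands $Format:...$ placeholders in archives.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo 'version.txt export-subst' > .gitattributes
printf 'Version: $Format:%%H$\n' > version.txt
git add .gitattributes version.txt
git commit -q -m "add version file"

head_hash=$(git rev-parse HEAD)
# Extract version.txt from the archive straight to standard output.
archived=$(git archive HEAD version.txt | tar -xO -f - version.txt)
printf '%s\n' "$archived"
```

The archived copy should contain the full commit hash, while `version.txt` in the working tree still contains `$Format:%H$`.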
Working Tree Encoding
# Working-tree encodings (content is stored as UTF-8 in the repository and re-encoded on checkout)
*.txt working-tree-encoding=UTF-8
legacy/*.txt working-tree-encoding=ISO-8859-1
windows/*.txt working-tree-encoding=UTF-16
Real-World Examples
Web Development Project
# .gitattributes for web project
* text=auto
# Source code
*.html text eol=lf
*.css text eol=lf
*.js text eol=lf
*.ts text eol=lf
*.json text eol=lf
*.md text eol=lf
# Configuration
*.yml text eol=lf
*.yaml text eol=lf
*.toml text eol=lf
*.ini text eol=lf
# Templates
*.hbs text eol=lf
*.mustache text eol=lf
# Assets (binary)
*.png binary
*.jpg binary
*.gif binary
*.ico binary
*.svg binary
*.woff binary
*.woff2 binary
*.ttf binary
*.eot binary
# Archives
*.zip binary
*.tar.gz binary
# Build artifacts (exclude from export)
dist/ export-ignore
node_modules/ export-ignore
.env export-ignore
*.log export-ignore
# Package files (custom merge; npm-merge and ours must be defined via git config merge.<name>.driver)
package.json merge=npm-merge
package-lock.json merge=ours
Data Science Project
# .gitattributes for data science
* text=auto
# Code files
*.py text eol=lf
*.R text eol=lf
*.sql text eol=lf
*.sh text eol=lf
# Data files
*.csv text eol=lf
*.tsv text eol=lf
*.json text eol=lf
*.xml text eol=lf
# Notebooks (no merge=union: a line-wise union can produce invalid JSON)
*.ipynb text eol=lf
# Binary data
*.pkl binary
*.h5 binary
*.parquet binary
*.arrow binary
# Images and plots
*.png binary diff=image
*.jpg binary diff=image
*.pdf binary
# Configuration
*.yml text eol=lf
*.yaml text eol=lf
*.toml text eol=lf
# Exclude from archives
data/ export-ignore
models/ export-ignore
.ipynb_checkpoints/ export-ignore
__pycache__/ export-ignore
Mobile App Project
# .gitattributes for mobile development
* text=auto
# Source code
*.swift text eol=lf
*.m text eol=lf
*.h text eol=lf
*.java text eol=lf
*.kt text eol=lf
*.xml text eol=lf
*.json text eol=lf
# Project files
*.xcodeproj/** text eol=lf
*.pbxproj text eol=lf merge=union
*.storyboard text eol=lf
*.xib text eol=lf
# Android
*.gradle text eol=lf
*.properties text eol=lf
*.xml text eol=lf
# Assets
*.png binary
*.jpg binary
*.gif binary
*.ttf binary
*.otf binary
# Audio/Video
*.mp3 binary
*.mp4 binary
*.wav binary
# Exclude build artifacts
build/ export-ignore
*.app export-ignore
*.ipa export-ignore
*.apk export-ignore
Testing and Validation
Check Current Attributes
# Check attributes for specific file
git check-attr --all README.md
# Check specific attribute
git check-attr text *.js
# Check multiple files
git check-attr eol src/*.js
# Example output
README.md: text: set
README.md: eol: lf
src/app.js: text: set
src/app.js: eol: lf
Validate Line Endings
# Check for CRLF line endings
git ls-files --eol
# Find files with mixed line endings (in the index or the working tree)
git ls-files --eol | grep -E "(i|w)/mixed"
Troubleshooting
Line Ending Issues
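Mixed line endings: files committed before .gitattributes existed keep their old line endings in the repository, which shows up as whole-file diffs when contributors work on different platforms.
# Fix line endings after adding .gitattributes
git add --renormalize .
# Check which files were modified by the normalization
git status --porcelain
git commit -m "Normalize line endings"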
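The renormalization step can be rehearsed in a scratch repository before touching a real project. This sketch assumes `core.autocrlf` is disabled for the demo and a placeholder identity; `notes.txt` is a made-up file committed with CRLF endings.

```shell
#!/bin/sh
# Sketch: git add --renormalize rewrites stored line endings after adding .gitattributes.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git config core.autocrlf false

# Commit a CRLF file before any attributes exist.
printf 'line one\r\nline two\r\n' > notes.txt
git add notes.txt
git commit -q -m "add CRLF file"
before=$(git ls-files --eol notes.txt)

echo '*.txt text eol=lf' > .gitattributes
git add --renormalize .                       # only touches tracked files
git add .gitattributes
after=$(git ls-files --eol notes.txt)

printf 'before: %s\nafter:  %s\n' "$before" "$after"
```

The index entry should change from `i/crlf` to `i/lf`; the working-tree copy is left as it was until the file is checked out again.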
Filter Issues
# Debug filter execution
git config filter.debug.clean 'tee /tmp/clean.log'
git config filter.debug.smudge 'tee /tmp/smudge.log'
# Test filter manually by piping sample input through the configured command
echo "password=test" | sh -c "$(git config filter.secrets.clean)"
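A filter command is an ordinary shell pipeline, so it can be exercised without Git at all. The sketch below hard-codes the command string for the `secrets` filter (in a repository it would come from `git config filter.secrets.clean`) and also checks that running the filter twice gives the same result as running it once, which keeps files from showing up as permanently modified. The sample input is made up.

```shell
#!/bin/sh
# Sketch: run a clean filter by hand and check that it is idempotent.
set -eu

# In a real repository: clean_cmd=$(git config filter.secrets.clean)
clean_cmd='sed "s/password=.*/password=HIDDEN/"'

sample='user=alice
password=hunter2'

cleaned=$(printf '%s\n' "$sample" | sh -c "$clean_cmd")
cleaned_twice=$(printf '%s\n' "$cleaned" | sh -c "$clean_cmd")

printf '%s\n' "$cleaned"
```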
Merge Driver Issues
# Test merge driver
git config merge.test.driver 'echo "Merge: %A %B" && false'
# Reset merge driver if problematic
git config --unset merge.problematic.driver
Best Practices
- Commit .gitattributes to ensure consistent behavior across team
- Use text=auto as default, then specify exceptions
- Test attributes with git check-attr before committing
- Document custom filters and merge drivers
- Be conservative with filters - they can complicate collaboration
Team Workflow
#!/bin/bash
# setup-git-attributes.sh - team setup script
# Copy team .gitattributes
cp .gitattributes.template .gitattributes
# Configure team filters
git config filter.secrets.clean '.git/hooks/clean-secrets'
git config filter.secrets.smudge 'cat'
# Configure merge drivers
git config merge.npm-merge.driver 'npm-merge-driver %O %A %B'
echo "Git attributes configured for team workflow"
Integration with CI/CD
#!/bin/bash
# validate-attributes.sh - CI validation script
# Check for consistent line endings in the repository
# (i/ is the index; w/crlf is expected for files marked eol=crlf)
echo "Checking line endings..."
git ls-files --eol | grep -E "i/(crlf|mixed)" && exit 1
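# Validate no secrets in repository
# (this script is excluded so that its own pattern does not match)
echo "Checking for secrets..."
git grep -E "(password|api_key|secret)" -- . ':(exclude)validate-attributes.sh' && exit 1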
# Check that every tracked file has an explicit text attribute
echo "Validating attributes..."
git ls-files | git check-attr --stdin text | grep ': unspecified$' && exit 1
echo "All attribute validations passed"
Performance Considerations
- Filters: Can slow down operations, especially on large files
- Line Ending Conversion: Adds overhead during checkout/commit
- Custom Merge Drivers: May increase merge time
- Binary Detection: Automatic detection requires file inspection
Key Takeaways
- Fine-grained Control: Customize Git behavior per file or pattern
- Cross-platform Consistency: Ensure consistent line endings and text handling
- Automation: Use filters for automatic content transformation
- Merge Customization: Control how different file types merge
- Team Collaboration: Share consistent repository behavior
Git attributes provide powerful customization capabilities for repository behavior. Master these tools to ensure consistent cross-platform development and automated content management.