Git - Attributes

Git attributes provide fine-grained control over how Git handles files in your repository. Through the .gitattributes file, you can customize line ending handling, apply filters, define merge strategies, and more.

What are Git Attributes?

Git attributes are per-path settings that control Git's behavior for specific files or patterns:

# .gitattributes file format
pattern attr1 attr2=value -attr3

# Examples
*.txt text
*.jpg binary
*.js text eol=lf
*.md merge=union

Attributes can be:

  • Set: `attribute` (boolean true)
  • Unset: `-attribute` (boolean false)
  • Set to value: `attribute=value`
  • Unspecified: Not mentioned (uses defaults)
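
The four states can be observed with `git check-attr`. The following is a minimal sketch, assuming a throwaway repository created with `mktemp` and no user-level or system-level attribute files that touch these patterns. The file names (`notes.txt`, `blob.bin`, `app.js`, `other.dat`) are hypothetical and do not need to exist, because `check-attr` only consults the attribute files.

```shell
#!/bin/sh
set -e

# Scratch repository (hypothetical location chosen by mktemp)
repo=$(mktemp -d)
cd "$repo"
git init -q .

# One pattern per state: set, unset, set to a value
cat > .gitattributes <<'EOF'
*.txt text
*.bin -text
*.js text eol=lf
EOF

# Query two attributes for four illustrative paths;
# other.dat matches no pattern, so both attributes are unspecified
states=$(git check-attr text eol -- notes.txt blob.bin app.js other.dat)
printf '%s\n' "$states"
```

Each output line has the form `path: attribute: state`, where the state is `set`, `unset`, `unspecified`, or the assigned value such as `lf`.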

The .gitattributes File

File Locations

# Repository-specific (committed with project)
.gitattributes

# Global (per user)  
~/.config/git/attributes

# System-wide (all users)
$(prefix)/etc/gitattributes

# Info attributes (not committed)
.git/info/attributes

Pattern Matching

# Exact filename
README.md text

# Wildcards
*.txt text
*.* text
**/*.js text

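# Directory patterns (a trailing slash matches nothing here; use /** instead)
docs/** text
src/**/*.py text

# Exceptions: "!pattern" negation is not allowed in .gitattributes.
# Later lines override earlier ones, so unset the attribute instead.
*.* text
*.jpg -text
*.png -text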
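
Git does not accept `!pattern` negation in attribute files, and when several lines match a path, the last matching line wins for each attribute. The sketch below illustrates both points in a scratch repository; the paths passed to `check-attr` are hypothetical and are never created.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Order matters: the broad default comes first, exceptions follow
cat > .gitattributes <<'EOF'
* text=auto
*.jpg -text
docs/** text
EOF

# docs/logo.jpg matches all three lines; the last one decides
matches=$(git check-attr text -- main.c photo.jpg docs/guide/intro.rst docs/logo.jpg)
printf '%s\n' "$matches"
```

Here `photo.jpg` ends up with `text` unset through the second line, while `docs/logo.jpg` is set again by the later `docs/**` line. Reordering the lines changes the result.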

Text and Line Ending Attributes

Text Attribute

# Mark files as text
*.txt text
*.md text  
*.json text
*.yml text

# Mark files as binary
*.jpg binary
*.png binary
*.exe binary

# Auto-detect text vs. binary content
*.unknown text=auto

End-of-Line (EOL) Handling

# Platform-specific line endings
# (comments must start the line; a trailing "# ..." is parsed as attribute names)
# Windows (CRLF)
*.txt text eol=crlf
# Unix/Linux (LF)
*.sh text eol=lf
# Windows batch files
*.bat text eol=crlf

# Cross-platform projects
* text=auto
*.txt text
*.md text
*.js text eol=lf
*.json text eol=lf

# Platform-specific files
*.sh text eol=lf
*.bat text eol=crlf
*.ps1 text eol=crlf

Complete Cross-Platform Setup

# .gitattributes for cross-platform project
# Set default behavior for all files
* text=auto

# Text files that should have consistent LF line endings
*.js text eol=lf
*.jsx text eol=lf
*.ts text eol=lf
*.tsx text eol=lf
*.json text eol=lf
*.md text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
*.xml text eol=lf
*.html text eol=lf
*.css text eol=lf
*.scss text eol=lf

# Shell scripts need LF
*.sh text eol=lf
*.bash text eol=lf

# Windows files need CRLF
*.bat text eol=crlf
*.cmd text eol=crlf
*.ps1 text eol=crlf

# Binary files
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.pdf binary
*.zip binary
*.tar.gz binary
*.exe binary
*.dll binary
*.so binary

Filters

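Filters transform file contents as they move between the working tree and the repository: the clean filter runs when a file is staged (checkin), and the smudge filter runs when a file is checked out.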

Clean and Smudge Filters

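# Configure filters (stored in .git/config, so each clone must run these)
# smudge: stamp the current date into the file on checkout
git config filter.keywords.smudge 'sed "s/Date:.*/Date: $(date)/"'
# clean: strip the date again so the committed content stays stable
git config filter.keywords.clean 'sed "s/Date:.*/Date:/"'

# Apply filter to files (in .gitattributes)
*.template filter=keywords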

Practical Filter Examples

# Remove sensitive data before commit
# (one-way: with smudge set to cat, a fresh checkout contains the redacted value)
git config filter.secrets.clean 'sed "s/password=.*/password=HIDDEN/"'
git config filter.secrets.smudge 'cat'

# Format JSON files
git config filter.json.clean 'jq .'
git config filter.json.smudge 'cat'

# Apply filters (in .gitattributes)
*.config filter=secrets
*.json filter=json
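
To see where a clean filter takes effect, it helps to compare the staged blob with the working-tree file. This is a minimal sketch in a scratch repository; the filter name `redact`, the file `app.config`, and the sample password are hypothetical stand-ins for the `secrets` setup above.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical filter: redact on the way in, pass through on the way out
git config filter.redact.clean 'sed "s/password=.*/password=HIDDEN/"'
git config filter.redact.smudge cat
echo '*.config filter=redact' > .gitattributes

echo 'password=hunter2' > app.config
git add .gitattributes app.config

# ":path" names the staged blob, which has passed through the clean filter
staged=$(git show :app.config)
# The working-tree file is left untouched by "git add"
worktree=$(cat app.config)

printf 'staged:   %s\n' "$staged"
printf 'worktree: %s\n' "$worktree"
```

The staged copy holds `password=HIDDEN` while the local file keeps its original value. A fresh clone would only ever see the redacted content, so a filter like this is a safety net rather than a way to store secrets.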

Advanced Filter Script

#!/bin/bash
# .git/filters/clean-secrets.sh

# Remove API keys and passwords
sed -e 's/api_key=.*/api_key=REDACTED/' \
    -e 's/password=.*/password=REDACTED/' \
    -e 's/"token":[[:space:]]*"[^"]*"/"token": "REDACTED"/'

# Configure advanced filter (the script must be executable: chmod +x)
git config filter.secrets.clean '.git/filters/clean-secrets.sh'
git config filter.secrets.smudge 'cat'

# Apply to configuration files (in .gitattributes)
*.env filter=secrets
config/*.json filter=secrets

Merge Strategies

Built-in Merge Drivers

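# Union merge - keep the lines from both sides instead of marking conflicts
CHANGELOG.md merge=union

# Keep ours - always prefer current branch
# ("ours" is not built in; define it once with: git config merge.ours.driver true)
*.generated merge=ours

# Binary merge - don't attempt text merge
*.db merge=binary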
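
The effect of `merge=union` is easiest to see on a merge that would otherwise conflict. The following sketch, assuming a scratch repository with a hypothetical `feature` branch and made-up changelog entries, lets both branches add a line at the same position and then merges them.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'CHANGELOG.md merge=union' > .gitattributes
printf 'v1\n' > CHANGELOG.md
git add .gitattributes CHANGELOG.md
git commit -q -m "base"
base_branch=$(git symbolic-ref --short HEAD)

# Each branch appends a different line after "v1"
git checkout -q -b feature
printf 'v1\nfeature entry\n' > CHANGELOG.md
git commit -q -am "feature entry"

git checkout -q "$base_branch"
printf 'v1\nmainline entry\n' > CHANGELOG.md
git commit -q -am "mainline entry"

# Without the attribute this merge stops with a conflict
git merge -q -m "merge feature" feature
merged=$(cat CHANGELOG.md)
printf '%s\n' "$merged"
```

The merged file contains both entries and no conflict markers. Union merging never removes lines, so it suits append-only files and can produce broken results for structured formats.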

Custom Merge Drivers

# Configure custom merge driver
git config merge.npm-merge.name "NPM package.json merge"
git config merge.npm-merge.driver "npm-merge-driver %O %A %B %L"

# Apply to package.json
package.json merge=npm-merge
package-lock.json merge=npm-merge

Merge Driver for Different File Types

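# Database schema files - manual merge required
# (the driver "false" always fails, so a merge that changes these files on both sides stops as a conflict)
*.sql merge=manual
git config merge.manual.driver false

# Documentation - union merge
*.md merge=union
docs/*.rst merge=union

# Generated files - always use ours
# (requires a user-defined driver: git config merge.ours.driver true)
dist/* merge=ours
build/* merge=ours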

Diff Drivers

Custom Diff Handling

# Configure diff drivers
git config diff.word.textconv "strings"
git config diff.json.textconv "jq ."

# Apply to files (in .gitattributes)
*.doc diff=word
*.json diff=json
# diff=pdf only takes effect once a diff.pdf.textconv command is configured
*.pdf diff=pdf
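
A `textconv` command receives a file name as its argument and prints a text rendering that Git then diffs. As a minimal sketch, the example below uses `od` as the converter for a hypothetical `*.bin` pattern; the driver name `hex` and the file `data.bin` are assumptions made for illustration.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical driver: render binary files as hex bytes for diffing
git config diff.hex.textconv 'od -An -tx1'
echo '*.bin diff=hex' > .gitattributes

# Three bytes, starting with NUL so Git treats the file as binary
printf '\000\001\002' > data.bin
git add .gitattributes data.bin
git commit -q -m "add binary file"

# Change the last byte and diff the working tree against the index
printf '\000\001\003' > data.bin
hexdiff=$(git diff -- data.bin)
printf '%s\n' "$hexdiff"
```

Instead of "Binary files differ", the diff shows one removed and one added line of hex bytes. The conversion is for display only, so such a diff cannot be applied as a patch.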

Image Diff

# Configure image diff
git config diff.image.textconv "identify"

# Apply to image files  
*.png diff=image
*.jpg diff=image
*.gif diff=image

Export Attributes

Archive Exclusion

# Exclude from git archive
.gitignore export-ignore
.gitattributes export-ignore
tests/ export-ignore
docs/ export-ignore
*.test.js export-ignore

# Re-include a file that an earlier export-ignore pattern excluded
# (untracked or ignored files are never part of an archive)
important-file.tmp -export-ignore

Export Substitution

# Substitute keywords in archives
version.txt export-subst

# In version.txt file:
# Version: $Format:%H$
# Date: $Format:%ci$
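
Both export attributes can be checked by piping `git archive` into `tar`. This is a sketch under stated assumptions: a scratch repository, hypothetical files `app.js` and `tests/app.test.js`, and attributes read from the committed `.gitattributes`, which is the default for `git archive`.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir tests
echo 'console.log("app")' > app.js
echo 'console.log("test")' > tests/app.test.js
# Single quotes keep the $Format placeholder literal
echo 'Version: $Format:%H$' > version.txt

cat > .gitattributes <<'EOF'
tests/ export-ignore
.gitattributes export-ignore
version.txt export-subst
EOF

git add .
git commit -q -m "initial"

# List the archive members; tests/ and .gitattributes should be absent
listing=$(git archive HEAD | tar -tf -)
# Extract version.txt to stdout; the placeholder is expanded
version_line=$(git archive HEAD version.txt | tar -xOf -)

printf '%s\n' "$listing"
printf '%s\n' "$version_line"
```

The placeholder in `version.txt` is replaced with the full commit hash only inside the archive. The committed file and the working tree keep the literal `$Format:%H$` text.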

Working Tree Encoding

# Handle different text encodings
*.txt working-tree-encoding=UTF-8
legacy/*.txt working-tree-encoding=ISO-8859-1
windows/*.txt working-tree-encoding=UTF-16

Real-World Examples

Web Development Project

# .gitattributes for web project
* text=auto

# Source code
*.html text eol=lf
*.css text eol=lf
*.js text eol=lf
*.ts text eol=lf
*.json text eol=lf
*.md text eol=lf

# Configuration
*.yml text eol=lf
*.yaml text eol=lf
*.toml text eol=lf
*.ini text eol=lf

# Templates
*.hbs text eol=lf
*.mustache text eol=lf

# Assets (binary)
*.png binary
*.jpg binary
*.gif binary
*.ico binary
*.svg binary
*.woff binary
*.woff2 binary
*.ttf binary
*.eot binary

# Archives
*.zip binary
*.tar.gz binary

# Build artifacts (exclude from export)
dist/ export-ignore
node_modules/ export-ignore
.env export-ignore
*.log export-ignore

# Package files (custom merge; both drivers must be defined with git config)
package.json merge=npm-merge
package-lock.json merge=ours

Data Science Project

# .gitattributes for data science
* text=auto

# Code files
*.py text eol=lf
*.R text eol=lf
*.sql text eol=lf
*.sh text eol=lf

# Data files
*.csv text eol=lf
*.tsv text eol=lf
*.json text eol=lf
*.xml text eol=lf

# Notebooks  
*.ipynb text eol=lf merge=union

# Binary data
*.pkl binary
*.h5 binary
*.parquet binary
*.arrow binary

# Images and plots
*.png binary diff=image
*.jpg binary diff=image
*.pdf binary

# Configuration
*.yml text eol=lf
*.yaml text eol=lf
*.toml text eol=lf

# Exclude from archives
data/ export-ignore
models/ export-ignore
.ipynb_checkpoints/ export-ignore
__pycache__/ export-ignore

Mobile App Project

# .gitattributes for mobile development
* text=auto

# Source code
*.swift text eol=lf
*.m text eol=lf
*.h text eol=lf
*.java text eol=lf
*.kt text eol=lf
*.xml text eol=lf
*.json text eol=lf

# Project files
*.xcodeproj/** text eol=lf
*.pbxproj text eol=lf merge=union
*.storyboard text eol=lf
*.xib text eol=lf

# Android
*.gradle text eol=lf
*.properties text eol=lf
*.xml text eol=lf

# Assets
*.png binary
*.jpg binary
*.gif binary
*.ttf binary
*.otf binary

# Audio/Video
*.mp3 binary
*.mp4 binary
*.wav binary

# Exclude build artifacts
build/ export-ignore
*.app export-ignore
*.ipa export-ignore
*.apk export-ignore

Testing and Validation

Check Current Attributes

# Check attributes for specific file
git check-attr --all README.md

# Check specific attribute
git check-attr text *.js

# Check several attributes for multiple files
git check-attr text eol -- README.md src/app.js

# Example output
README.md: text: set
README.md: eol: lf
src/app.js: text: set
src/app.js: eol: lf

Validate Line Endings

# Show index (i/) and working-tree (w/) line endings for each file
git ls-files --eol

# Find working-tree files that contain CRLF (mixed files show "CRLF, LF")
git ls-files -z | xargs -0 file | grep CRLF

Troubleshooting

Line Ending Issues

Mixed Line Endings: Files that mix CRLF and LF produce noisy diffs and spurious conflicts in cross-platform development.

# Fix line endings after adding .gitattributes
git add --renormalize .

# Review the files that were re-staged, then commit
git status --porcelain
git commit -m "Normalize line endings"
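
The effect of renormalizing can be verified with `git ls-files --eol`, which reports the line endings stored in the index (`i/`) and in the working tree (`w/`). A minimal sketch, assuming a scratch repository, Git 2.16 or later, and a hypothetical `notes.txt` that was committed with CRLF endings:

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
# Keep Git from converting anything before attributes exist
git config core.autocrlf false

# Commit a file with CRLF endings, as happens without .gitattributes
printf 'one\r\ntwo\r\n' > notes.txt
git add notes.txt
git commit -q -m "CRLF committed by accident"
before=$(git ls-files --eol notes.txt)

# Introduce normalization, then re-stage tracked files through it
echo '* text=auto' > .gitattributes
git add --renormalize .
after=$(git ls-files --eol notes.txt)

printf 'before: %s\n' "$before"
printf 'after:  %s\n' "$after"
```

The index entry changes from `i/crlf` to `i/lf`, while the working-tree copy keeps its CRLF endings until the file is checked out again.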

Filter Issues

# Debug filter execution
git config filter.debug.clean 'tee /tmp/clean.log'
git config filter.debug.smudge 'tee /tmp/smudge.log'

# Test filter manually by running the configured command on sample input
echo "test content" | sh -c "$(git config filter.secrets.clean)"

Merge Driver Issues

# Test merge driver
git config merge.test.driver 'echo "Merge: %A %B" && false'

# Reset merge driver if problematic
git config --unset merge.problematic.driver

Best Practices

  • Commit .gitattributes to ensure consistent behavior across team
  • Use text=auto as default, then specify exceptions
  • Test attributes with git check-attr before committing
  • Document custom filters and merge drivers
  • Be conservative with filters - they can complicate collaboration

Team Workflow

#!/bin/bash
# setup-git-attributes.sh - team setup script

# Copy team .gitattributes
cp .gitattributes.template .gitattributes

# Configure team filters
# (.git/ is not versioned, so the filter script must be installed in each clone)
git config filter.secrets.clean '.git/hooks/clean-secrets'
git config filter.secrets.smudge 'cat'

# Configure merge drivers (the name must match merge=npm-merge in .gitattributes)
git config merge.npm-merge.driver 'npm-merge-driver %O %A %B'

echo "Git attributes configured for team workflow"

Integration with CI/CD

#!/bin/bash
# validate-attributes.sh - CI validation script

# Check for consistent line endings (normalized text is stored with LF in the index)
echo "Checking line endings..."
git ls-files --eol | grep -E "^i/(crlf|mixed)" && exit 1

# Validate no secrets in repository (naive keyword scan; skips this script)
echo "Checking for secrets..."
git grep -E "(password|api_key|secret)" -- ':!validate-attributes.sh' && exit 1

# Check that every tracked file has a text attribute
# (--all never lists unspecified attributes, so query text explicitly)
echo "Validating attributes..."
git ls-files | git check-attr --stdin text | grep 'unspecified$' && exit 1

echo "All attribute validations passed"

Performance Considerations

  • Filters: Can slow down operations, especially on large files
  • Line Ending Conversion: Adds overhead during checkout/commit
  • Custom Merge Drivers: May increase merge time
  • Binary Detection: Automatic detection requires file inspection

Key Takeaways

  • Fine-grained Control: Customize Git behavior per file or pattern
  • Cross-platform Consistency: Ensure consistent line endings and text handling
  • Automation: Use filters for automatic content transformation
  • Merge Customization: Control how different file types merge
  • Team Collaboration: Share consistent repository behavior

Git attributes provide powerful customization capabilities for repository behavior. Master these tools to ensure consistent cross-platform development and automated content management.