Vc
1.3.2-dev
SIMD Vector Classes for C++
Additional classes, macros, and functions that help to work more easily with the main vector types.
Classes | |
class | CpuId |
This class is available for x86 / AMD64 systems to read and interpret information about the CPU's capabilities. More... | |
struct | ImplementationT< Features > |
This class identifies the specific implementation Vc uses in the current translation unit in terms of a type. More... | |
class | Allocator< T > |
An allocator that uses global new and supports over-aligned types, as per [C++11 20.6.9]. More... | |
struct | AlignedBase< Alignment > |
Helper class to ensure a given alignment. More... | |
class | InterleavedMemoryWrapper< S, V > |
Wraps a pointer to memory with convenience functions to access it via vectors. More... | |
class | Memory< V, Size1, Size2, InitPadding > |
A helper class for fixed-size two-dimensional arrays. More... | |
class | Memory< V, Size, 0u, InitPadding > |
A helper class to simplify usage of correctly aligned and padded memory, allowing both vector and scalar access. More... | |
class | Memory< V, 0u, 0u, true > |
A helper class that is very similar to Memory<V, Size> but with dynamically allocated memory and thus dynamic size. More... | |
Macros | |
#define | Vc_DECLARE_ALLOCATOR(Type) |
Convenience macro to set the default allocator for a given Type to Vc::Allocator. More... | |
Typedefs | |
using | CurrentImplementation = ImplementationT< > |
Identifies the Vc implementation used in the current translation unit. More... | |
template<typename T , typename Allocator = std::allocator<T>> | |
using | vector = Common::AdaptSubscriptOperator< std::vector< T, Allocator >> |
An adapted std::vector container with an additional subscript operator which implements gather and scatter operations. More... | |
using | VectorAlignedBase = AlignedBase< Detail::max(alignof(Vector< float >), alignof(Vector< double >), alignof(Vector< ullong >), alignof(Vector< llong >), alignof(Vector< ulong >), alignof(Vector< long >), alignof(Vector< uint >), alignof(Vector< int >), alignof(Vector< ushort >), alignof(Vector< short >), alignof(Vector< uchar >), alignof(Vector< schar >))> |
Helper type to ensure suitable alignment for any Vc::Vector<T> type (using the default VectorAbi). More... | |
template<typename V > | |
using | VectorAlignedBaseT = AlignedBase< alignof(V)> |
Variant of the above type ensuring suitable alignment only for the specified vector type V . More... | |
using | MemoryAlignedBase = AlignedBase< Detail::max(Vector< float >::MemoryAlignment, Vector< double >::MemoryAlignment, Vector< ullong >::MemoryAlignment, Vector< llong >::MemoryAlignment, Vector< ulong >::MemoryAlignment, Vector< long >::MemoryAlignment, Vector< uint >::MemoryAlignment, Vector< int >::MemoryAlignment, Vector< ushort >::MemoryAlignment, Vector< short >::MemoryAlignment, Vector< uchar >::MemoryAlignment, Vector< schar >::MemoryAlignment)> |
Helper class to ensure suitable alignment for arrays of scalar objects for any Vc::Vector<T> type (using the default VectorAbi). More... | |
template<typename V > | |
using | MemoryAlignedBaseT = AlignedBase< V::MemoryAlignment > |
Variant of the above type ensuring suitable alignment only for the specified vector type V . More... | |
using | llong = long long |
long long shorthand | |
using | ullong = unsigned long long |
unsigned long long shorthand | |
using | ulong = unsigned long |
unsigned long shorthand | |
using | uint = unsigned int |
unsigned int shorthand | |
using | ushort = unsigned short |
unsigned short shorthand | |
using | uchar = unsigned char |
unsigned char shorthand | |
using | schar = signed char |
signed char shorthand | |
Enumerations | |
enum | MallocAlignment { AlignOnVector, AlignOnCacheline, AlignOnPage } |
Enum that specifies the alignment and padding restrictions to use for memory allocation with Vc::malloc. More... | |
enum | Implementation : std::uint_least32_t { ScalarImpl, SSE2Impl, SSE3Impl, SSSE3Impl, SSE41Impl, SSE42Impl, AVXImpl, AVX2Impl, MICImpl } |
Enum to identify a certain SIMD instruction set. More... | |
enum | ExtraInstructions : std::uint_least32_t { Float16cInstructions = 0x01000, Fma4Instructions = 0x02000, XopInstructions = 0x04000, PopcntInstructions = 0x08000, Sse4aInstructions = 0x10000, FmaInstructions = 0x20000, VexInstructions = 0x40000, Bmi2Instructions = 0x80000 } |
The list of available instructions is not easily described by a linear list of instruction sets. More... | |
Functions | |
const char * | versionString () |
constexpr unsigned int | versionNumber () |
template<typename V , typename Parent , typename Dimension , typename RM > | |
std::ostream & | operator<< (std::ostream &s, const Vc::MemoryBase< V, Parent, Dimension, RM > &m) |
Prints the contents of a Memory object into a stream object. More... | |
template<typename Mask , typename T > | |
enable_if< is_simd_mask< Mask >::value &&is_simd_vector< T >::value, T > | iif (const Mask &condition, const T &trueValue, const T &falseValue) |
Function to mimic the ternary operator '?:' (inline-if). More... | |
template<typename T > | |
constexpr T | iif (bool condition, const T &trueValue, const T &falseValue) |
Overload of the above for boolean conditions. More... | |
template<typename V , typename = enable_if<Traits::is_simd_vector<V>::value>> | |
std::pair< V, V > | interleave (const V &a, const V &b) |
Interleaves the entries from a and b into two vectors of the same type. More... | |
template<typename Container , typename T > | |
constexpr auto | makeContainer (std::initializer_list< T > list) -> decltype(make_container_helper< Container, T >::help(list)) |
Construct a container of Vc vectors from a std::initializer_list of scalar entries. More... | |
template<typename T , Vc::MallocAlignment A> | |
T * | malloc (size_t n) |
Allocates memory on the Heap with alignment and padding suitable for vectorized access. More... | |
template<typename T > | |
void | free (T *p) |
Frees memory that was allocated with Vc::malloc. More... | |
void | prefetchForOneRead (const void *addr) |
Prefetch the cacheline containing addr for a single read access. More... | |
void | prefetchForModify (const void *addr) |
Prefetch the cacheline containing addr for modification. More... | |
void | prefetchClose (const void *addr) |
Prefetch the cacheline containing addr to L1 cache. More... | |
void | prefetchMid (const void *addr) |
Prefetch the cacheline containing addr to L2 cache. More... | |
void | prefetchFar (const void *addr) |
Prefetch the cacheline containing addr to L3 cache. More... | |
template<typename V , typename T , typename Abi > | |
enable_if< (V::size()==Vector< T, Abi >::size()&&sizeof(typename V::VectorEntryType)==sizeof(typename Vector< T, Abi >::VectorEntryType)&&sizeof(V)==sizeof(Vector< T, Abi >)&&alignof(V)<=alignof(Vector< T, Abi >)), V > | reinterpret_components_cast (const Vector< T, Abi > &x) |
Constructs a new Vector object of type V from the Vector x , reinterpreting the bits of x for the new type V . More... | |
template<typename M > | |
constexpr WhereImpl::WhereMask< M > | where (const M &mask) |
Conditional assignment. More... | |
Variables | |
constexpr AlignedTag | Aligned |
Use this object for a flags parameter to request aligned loads and stores. More... | |
constexpr UnalignedTag | Unaligned |
Use this object for a flags parameter to request unaligned loads and stores. More... | |
constexpr StreamingTag | Streaming |
Use this object for a flags parameter to request streaming loads and stores. More... | |
constexpr LoadStoreFlags::LoadStoreFlags< PrefetchFlag<> > | PrefetchDefault |
Use this object for a flags parameter to request default software prefetches to be emitted. | |
constexpr VectorSpecialInitializerZero | Zero = {} |
The special object Vc::Zero can be used to construct Vector and Mask objects initialized to zero/false . | |
constexpr VectorSpecialInitializerOne | One = {} |
The special object Vc::One can be used to construct Vector and Mask objects initialized to one/true . | |
constexpr VectorSpecialInitializerIndexesFromZero | IndexesFromZero = {} |
The special object Vc::IndexesFromZero can be used to construct Vector objects initialized to values 0, 1, 2, 3, 4, ... | |
Compiler Identification Macros | |
#define | Vc_ICC __INTEL_COMPILER_BUILD_DATE |
This macro is defined to a number identifying the ICC version if the current translation unit is compiled with the Intel compiler. More... | |
#define | Vc_CLANG (__clang_major__ * 0x10000 + __clang_minor__ * 0x100 + __clang_patchlevel__) |
This macro is defined to a number identifying the Clang version if the current translation unit is compiled with the Clang compiler. More... | |
#define | Vc_APPLECLANG (__clang_major__ * 0x10000 + __clang_minor__ * 0x100 + __clang_patchlevel__) |
This macro is defined to a number identifying the Apple Clang version if the current translation unit is compiled with the Apple Clang compiler. More... | |
#define | Vc_GCC (__GNUC__ * 0x10000 + __GNUC_MINOR__ * 0x100 + __GNUC_PATCHLEVEL__) |
This macro is defined to a number identifying the GCC version if the current translation unit is compiled with the GCC compiler. More... | |
#define | Vc_MSVC _MSC_FULL_VER |
This macro is defined to a number identifying the Microsoft Visual C++ version if the current translation unit is compiled with the Visual C++ (MSVC) compiler. More... | |
Micro-Architecture Feature Tests | |
unsigned int | extraInstructionsSupported () |
Determines the extra instructions supported by the current CPU. More... | |
bool | isImplementationSupported (Vc::Implementation impl) |
Tests whether the given implementation is supported by the system the code is executing on. More... | |
Vc::Implementation | bestImplementationSupported () |
Determines the best supported implementation for the current system. More... | |
bool | currentImplementationSupported () |
Tests that the CPU and operating system support the vector unit that the code was compiled for. More... | |
Version Macros | |
#define | Vc_VERSION_STRING "1.3.2-dev" |
Contains the version string of the Vc headers. More... | |
#define | Vc_VERSION_NUMBER 0x010305 |
Contains the encoded version number of the Vc headers. More... | |
#define | Vc_VERSION_CHECK(major, minor, patch) ((major << 16) | (minor << 8) | (patch << 1)) |
Helper macro to compare against an encoded version number. More... | |
SIMD Support Feature Macros | |
#define | Vc_IMPL_XOP |
This macro is defined if the current translation unit is compiled with XOP instruction support. | |
#define | Vc_IMPL_FMA4 |
This macro is defined if the current translation unit is compiled with FMA4 instruction support. | |
#define | Vc_IMPL_F16C |
This macro is defined if the current translation unit is compiled with F16C instruction support. | |
#define | Vc_IMPL_POPCNT |
This macro is defined if the current translation unit is compiled with POPCNT instruction support. | |
#define | Vc_IMPL_SSE4a |
This macro is defined if the current translation unit is compiled with SSE4a instruction support. | |
#define | Vc_IMPL_Scalar |
This macro is defined if the current translation unit is compiled without any SIMD support. | |
#define | Vc_IMPL_SSE |
This macro is defined if the current translation unit is compiled with any version of SSE (but not AVX). | |
#define | Vc_IMPL_SSE2 |
This macro is defined if the current translation unit is compiled with SSE2 instruction support (excluding SSE3 and up). | |
#define | Vc_IMPL_SSE3 |
This macro is defined if the current translation unit is compiled with SSE3 instruction support (excluding SSSE3 and up). | |
#define | Vc_IMPL_SSSE3 |
This macro is defined if the current translation unit is compiled with SSSE3 instruction support (excluding SSE4.1 and up). | |
#define | Vc_IMPL_SSE4_1 |
This macro is defined if the current translation unit is compiled with SSE4.1 instruction support (excluding SSE4.2 and up). | |
#define | Vc_IMPL_SSE4_2 |
This macro is defined if the current translation unit is compiled with SSE4.2 instruction support (excluding AVX and up). | |
#define | Vc_IMPL_AVX |
This macro is defined if the current translation unit is compiled with AVX instruction support (excluding AVX2 and up). | |
#define | Vc_IMPL_AVX2 |
This macro is defined if the current translation unit is compiled with AVX2 instruction support. | |
#define | Vc_IMPL_MIC |
This macro is defined if the current translation unit is compiled for the Knights Corner Xeon Phi instruction set. | |
SIMD Vector Size Macros | |
#define | Vc_DOUBLE_V_SIZE |
An integer (for use with the preprocessor) that gives the number of entries in a double_v. | |
#define | Vc_FLOAT_V_SIZE |
An integer (for use with the preprocessor) that gives the number of entries in a float_v. | |
#define | Vc_INT_V_SIZE |
An integer (for use with the preprocessor) that gives the number of entries in an int_v. | |
#define | Vc_UINT_V_SIZE |
An integer (for use with the preprocessor) that gives the number of entries in a uint_v. | |
#define | Vc_SHORT_V_SIZE |
An integer (for use with the preprocessor) that gives the number of entries in a short_v. | |
#define | Vc_USHORT_V_SIZE |
An integer (for use with the preprocessor) that gives the number of entries in a ushort_v. | |
Boolean Reductions | |
template<typename Mask > | |
constexpr bool | all_of (const Mask &m) |
Returns whether all entries in the mask m are true . | |
constexpr bool | all_of (bool b) |
Returns b . | |
template<typename Mask > | |
constexpr bool | any_of (const Mask &m) |
Returns whether at least one entry in the mask m is true . | |
constexpr bool | any_of (bool b) |
Returns b . | |
template<typename Mask > | |
constexpr bool | none_of (const Mask &m) |
Returns whether all entries in the mask m are false . | |
constexpr bool | none_of (bool b) |
Returns !b . | |
template<typename Mask > | |
constexpr bool | some_of (const Mask &m) |
Returns whether at least one entry in m is true and at least one entry in m is false . | |
constexpr bool | some_of (bool) |
Returns false . | |
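The four reductions listed above can be modelled on a plain array of booleans. The following is only a sketch of the documented semantics: the `*_model` names and the use of `std::array<bool, N>` as a stand-in for a Vc mask are assumptions made for illustration, not part of the Vc API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a SIMD mask: one bool per vector entry.
template <std::size_t N>
bool all_of_model(const std::array<bool, N> &m) {
    for (bool b : m) if (!b) return false;  // any false entry fails the test
    return true;
}

template <std::size_t N>
bool any_of_model(const std::array<bool, N> &m) {
    for (bool b : m) if (b) return true;    // one true entry is enough
    return false;
}

template <std::size_t N>
bool none_of_model(const std::array<bool, N> &m) {
    return !any_of_model(m);
}

// some_of: at least one entry is true and at least one entry is false.
template <std::size_t N>
bool some_of_model(const std::array<bool, N> &m) {
    return any_of_model(m) && !all_of_model(m);
}

void example_reductions() {
    const std::array<bool, 4> mixed{{true, false, true, false}};
    const std::array<bool, 4> full{{true, true, true, true}};
    assert(!all_of_model(mixed) && any_of_model(mixed));
    assert(!none_of_model(mixed) && some_of_model(mixed));
    assert(all_of_model(full) && !some_of_model(full));
}
```

The sketch makes the difference between any_of and some_of visible: a mask with every entry true satisfies any_of but not some_of.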
#define Vc_VERSION_STRING "1.3.2-dev" |
Contains the version string of the Vc headers.
Same as Vc::versionString().
Definition at line 40 of file version.h.
Referenced by Vc::versionString().
#define Vc_VERSION_NUMBER 0x010305 |
Contains the encoded version number of the Vc headers.
Same as Vc::versionNumber().
Definition at line 46 of file version.h.
Referenced by Vc::versionNumber().
#define Vc_VERSION_CHECK(major, minor, patch) ((major << 16) | (minor << 8) | (patch << 1))
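A small sketch of how the encoded version number and the check macro fit together. The `SKETCH_` macros are local stand-ins that copy the definitions quoted on this page so that the example builds without the Vc headers; real code would use Vc_VERSION_NUMBER and Vc_VERSION_CHECK directly. Note that the listed number 0x010305 is one greater than the encoding of 1.3.2; reading the lowest bit as a marker for the -dev state is an inference from these values, not something this page states.

```cpp
#include <cassert>

// Local copies of the definitions shown above (stand-ins, not the Vc macros).
#define SKETCH_VERSION_NUMBER 0x010305  // value listed for 1.3.2-dev
#define SKETCH_VERSION_CHECK(major, minor, patch) \
    (((major) << 16) | ((minor) << 8) | ((patch) << 1))

// 1 << 16 = 0x010000, 3 << 8 = 0x000300, 2 << 1 = 0x000004.
static_assert(SKETCH_VERSION_CHECK(1, 3, 2) == 0x010304, "encoding of 1.3.2");
static_assert(SKETCH_VERSION_NUMBER >= SKETCH_VERSION_CHECK(1, 3, 0), "at least 1.3.0");
static_assert(SKETCH_VERSION_NUMBER < SKETCH_VERSION_CHECK(1, 4, 0), "older than 1.4.0");

// Typical preprocessor use: select a code path by header version.
#if SKETCH_VERSION_NUMBER >= SKETCH_VERSION_CHECK(1, 3, 0)
constexpr bool has_1_3_api = true;
#else
constexpr bool has_1_3_api = false;
#endif

void example_version_check() {
    assert(has_1_3_api);
}
```

Because the result is a plain integer, the comparison works both in `#if` directives and in `static_assert`.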
#define Vc_DECLARE_ALLOCATOR(Type)
Convenience macro to set the default allocator for a given Type to Vc::Allocator.
Type | Your type that you want to use with STL containers. |
using CurrentImplementation = ImplementationT< > |
Identifies the Vc implementation used in the current translation unit.
using vector = Common::AdaptSubscriptOperator<std::vector<T, Allocator>> |
An adapted std::vector container with an additional subscript operator which implements gather and scatter operations.
using VectorAlignedBase = AlignedBase< Detail::max(alignof(Vector<float>), alignof(Vector<double>), alignof(Vector<ullong>), alignof(Vector<llong>), alignof(Vector<ulong>), alignof(Vector<long>), alignof(Vector<uint>), alignof(Vector<int>), alignof(Vector<ushort>), alignof(Vector<short>), alignof(Vector<uchar>), alignof(Vector<schar>))> |
Helper type to ensure suitable alignment for any Vc::Vector<T> type (using the default VectorAbi).
This class reimplements the new and delete operators to align objects allocated on the heap suitably for objects of Vc::Vector<T> type. This is necessary since the standard new operator does not adhere to the alignment requirements of the type.
Definition at line 90 of file alignedbase.h.
using VectorAlignedBaseT = AlignedBase<alignof(V)> |
Variant of the above type ensuring suitable alignment only for the specified vector type V.
Definition at line 100 of file alignedbase.h.
using MemoryAlignedBase = AlignedBase< Detail::max(Vector<float>::MemoryAlignment, Vector<double>::MemoryAlignment, Vector<ullong>::MemoryAlignment, Vector<llong>::MemoryAlignment, Vector<ulong>::MemoryAlignment, Vector<long>::MemoryAlignment, Vector<uint>::MemoryAlignment, Vector<int>::MemoryAlignment, Vector<ushort>::MemoryAlignment, Vector<short>::MemoryAlignment, Vector<uchar>::MemoryAlignment, Vector<schar>::MemoryAlignment)> |
Helper class to ensure suitable alignment for arrays of scalar objects for any Vc::Vector<T> type (using the default VectorAbi).
This class reimplements the new and delete operators to align objects allocated on the heap suitably for arrays of type Vc::Vector<T>::EntryType. Subsequent load and store operations are safe to use the aligned variant.
Definition at line 122 of file alignedbase.h.
using MemoryAlignedBaseT = AlignedBase<V::MemoryAlignment> |
Variant of the above type ensuring suitable alignment only for the specified vector type V.
Definition at line 132 of file alignedbase.h.
enum MallocAlignment |
Enum that specifies the alignment and padding restrictions to use for memory allocation with Vc::malloc.
enum Implementation : std::uint_least32_t |
Enum to identify a certain SIMD instruction set.
You can use CurrentImplementation for the currently active implementation.
enum ExtraInstructions : std::uint_least32_t |
The list of available instructions is not easily described by a linear list of instruction sets.
On x86 each of the following instruction sets includes all of its predecessors: SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2.
There are, however, additional instructions that none of the sets in this list necessarily includes. These are covered by this enum.
unsigned int Vc::extraInstructionsSupported()
Determines the extra instructions supported by the current CPU.
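Assuming the returned unsigned int is a bitwise combination of the ExtraInstructions flags (the enumerator values are distinct bits, which suggests this, but the page does not spell it out), a feature test could look like the sketch below. The enum is a local copy of the values listed above so the example builds without Vc, `hasAll` is a hypothetical helper, and the `supported` value is made up rather than read from a CPU.

```cpp
#include <cassert>
#include <cstdint>

// Local copy of the enumerator values listed on this page (stand-in for
// Vc::ExtraInstructions so the sketch compiles without the Vc headers).
enum ExtraInstructions : std::uint_least32_t {
    Float16cInstructions = 0x01000,
    Fma4Instructions     = 0x02000,
    XopInstructions      = 0x04000,
    PopcntInstructions   = 0x08000,
    Sse4aInstructions    = 0x10000,
    FmaInstructions      = 0x20000,
    VexInstructions      = 0x40000,
    Bmi2Instructions     = 0x80000
};

// Hypothetical helper: true if every bit in `wanted` is set in `supported`.
constexpr bool hasAll(unsigned int supported, unsigned int wanted) {
    return (supported & wanted) == wanted;
}

void example_extra_instructions() {
    // Made-up value standing in for the result of Vc::extraInstructionsSupported().
    const unsigned int supported = PopcntInstructions | FmaInstructions;
    assert(hasAll(supported, PopcntInstructions));
    assert(!hasAll(supported, Fma4Instructions));
    assert(!hasAll(supported, FmaInstructions | Bmi2Instructions));
}
```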
bool Vc::isImplementationSupported(Vc::Implementation impl)
Tests whether the given implementation is supported by the system the code is executing on.
Returns true if the OS and hardware support execution of instructions defined by impl, false otherwise.
impl | The SIMD target to test for. |
Vc::Implementation Vc::bestImplementationSupported()
Determines the best supported implementation for the current system.
bool Vc::currentImplementationSupported() [inline]
Tests that the CPU and operating system support the vector unit that the code was compiled for.
This function should be called before any other Vc functionality is used. It checks whether the program will work. If this function returns false, the program should exit with a useful error message before the OS has to kill it because of an invalid instruction exception.
If the program continues and makes use of any vector features not supported by the hardware or software, the program will crash.
Returns true if the OS and hardware support execution of the currently selected SIMD instructions, false otherwise.
constexpr unsigned int Vc::versionNumber()
template<typename V, typename Parent, typename Dimension, typename RM>
std::ostream & operator<<(std::ostream &s, const Vc::MemoryBase<V, Parent, Dimension, RM> &m) [inline]
Prints the contents of a Memory object into a stream object.
Example output (with SSE):
{[0, 1, 2, 3] [4, 5, 6, 7] [8, 9, 0, 0]}
s | Any standard C++ ostream object. For example std::cout or a std::stringstream object. |
m | Any Vc::Memory object. |
template<typename Mask, typename T>
enable_if<is_simd_mask<Mask>::value && is_simd_vector<T>::value, T> Vc::iif(const Mask &condition, const T &trueValue, const T &falseValue) [inline, delete]
Function to mimic the ternary operator '?:' (inline-if).
condition | Determines which values are returned. This is analogous to the first argument to the ternary operator. |
trueValue | The values to return where condition is true. |
falseValue | The values to return where condition is false. |
Returns a combination of entries from trueValue and falseValue, according to condition.
Use it wherever scalar code would use the ternary operator on single values.
Assuming a has the values [0, 3, 5, 1], b is [1, 1, 1, 1], and c is [1, 2, 3, 4], then x will be [2, 2, 3, 5].
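The per-entry selection can be modelled in plain C++. This is only a sketch: `Vec4`, `Mask4`, and `iif_model` are hypothetical stand-ins for a 4-wide Vc vector, its mask, and Vc::iif. The expression that produced x is not shown here, so the call below (selecting c where a > 1 and b + c elsewhere) is an assumption chosen because it reproduces the values quoted above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for a 4-wide vector and its mask type.
using Vec4 = std::array<int, 4>;
using Mask4 = std::array<bool, 4>;

// Per-entry model of inline-if: take t[i] where cond[i] is true, else f[i].
Vec4 iif_model(const Mask4 &cond, const Vec4 &t, const Vec4 &f) {
    Vec4 r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] = cond[i] ? t[i] : f[i];
    }
    return r;
}

void example_iif() {
    const Vec4 a{{0, 3, 5, 1}};
    const Vec4 b{{1, 1, 1, 1}};
    const Vec4 c{{1, 2, 3, 4}};
    Mask4 mask{};
    Vec4 sum{};
    for (std::size_t i = 0; i < a.size(); ++i) {
        mask[i] = a[i] > 1;    // [false, true, true, false]
        sum[i] = b[i] + c[i];  // [2, 3, 4, 5]
    }
    // Assumed expression: iif(a > 1, c, b + c).
    const Vec4 x = iif_model(mask, c, sum);
    assert((x == Vec4{{2, 2, 3, 5}}));
}
```

Both value arguments are evaluated for every entry before the selection happens, which differs from the short-circuit behaviour of the scalar ternary operator.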
constexpr T Vc::iif(bool condition, const T &trueValue, const T &falseValue)
Overload of the above for boolean conditions.
This typically results in direct use of the ternary operator. This function makes it easier to switch from a Vc type to a builtin type.
condition | Determines which value is returned. This is analogous to the first argument to the ternary operator. |
trueValue | The value to return if condition is true. |
falseValue | The value to return if condition is false. |
Returns trueValue or falseValue, depending on condition.
std::pair<V, V> Vc::interleave(const V &a, const V &b)
Interleaves the entries from a and b into two vectors of the same type.
The returned vectors contain the elements in the order a[0], b[0], a[1], b[1], a[2], b[2], a[3], b[3], ...
a | input vector whose data will appear at even indexes in the output |
b | input vector whose data will appear at odd indexes in the output |
Returns two vectors with the data from a and b interleaved.
Definition at line 55 of file interleave.h.
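A scalar model of the interleave order described above. `Vec4` and `interleave_model` are hypothetical names; the sketch also assumes that the first vector of the returned pair holds the first half of the interleaved sequence and the second vector the remainder, which this page does not state explicitly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a 4-wide vector.
using Vec4 = std::array<int, 4>;

std::pair<Vec4, Vec4> interleave_model(const Vec4 &a, const Vec4 &b) {
    // Build the full sequence a[0], b[0], a[1], b[1], ... first.
    std::array<int, 8> all{};
    for (std::size_t i = 0; i < a.size(); ++i) {
        all[2 * i] = a[i];      // a lands on even indexes
        all[2 * i + 1] = b[i];  // b lands on odd indexes
    }
    // Assumption: the pair splits that sequence into its two halves.
    Vec4 first{};
    Vec4 second{};
    for (std::size_t i = 0; i < first.size(); ++i) {
        first[i] = all[i];
        second[i] = all[first.size() + i];
    }
    return {first, second};
}

void example_interleave() {
    const Vec4 a{{0, 1, 2, 3}};
    const Vec4 b{{10, 11, 12, 13}};
    const std::pair<Vec4, Vec4> r = interleave_model(a, b);
    assert((r.first == Vec4{{0, 10, 1, 11}}));
    assert((r.second == Vec4{{2, 12, 3, 13}}));
}
```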
constexpr auto Vc::makeContainer(std::initializer_list<T> list) -> decltype(make_container_helper<Container, T>::help(list))
Construct a container of Vc vectors from a std::initializer_list of scalar entries.
Container | The container type to construct. |
T | The scalar type to use for the initializer_list. |
list | An initializer list of arbitrary size. The type of the entries is important! If you pass a list of integers you will get a container filled with Vc::int_v objects. If, instead, you want to have a container of Vc::float_v objects, be sure to include a period (.) and the 'f' postfix in the literals. Alternatively, you can pass the type as second template argument to makeContainer. |
If the number of entries in list does not match the number of values in the returned container object, the remaining values in the returned object will be zero-initialized.
Definition at line 138 of file makeContainer.h.
template<typename T, Vc::MallocAlignment A>
T * Vc::malloc(size_t n) [inline]
Allocates memory on the heap with alignment and padding suitable for vectorized access.
Memory that was allocated with this function must be released with Vc::free! Other methods might work but are not portable.
n | Specifies the number of objects the allocated memory must be able to store. |
T | The type of the allocated memory. Note that the constructor is not called. |
A | Determines the alignment of the memory. See Vc::MallocAlignment. |
The allocated memory is padded at the end to a multiple of the alignment requested via A. Thus if you request memory for 21 int objects, aligned via Vc::AlignOnCacheline, you can safely read a full cacheline until the end of the array, without generating an out-of-bounds access. For a cacheline size of 64 bytes and an int size of 4 bytes you would thus get an array of 128 bytes to work with.
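The arithmetic in the example above can be checked with a short sketch. `paddedBytes` is a hypothetical helper that only models the rounding described in the text; it is not how Vc::malloc is implemented, and the sizes are passed in explicitly rather than queried from the platform.

```cpp
#include <cassert>
#include <cstddef>

// Round count * elemSize up to the next multiple of alignment (all in bytes).
constexpr std::size_t paddedBytes(std::size_t count, std::size_t elemSize,
                                  std::size_t alignment) {
    return (count * elemSize + alignment - 1) / alignment * alignment;
}

// 21 ints of 4 bytes are 84 bytes; the next multiple of 64 is 128.
static_assert(paddedBytes(21, 4, 64) == 128, "matches the example in the text");
// A request that already ends on a cacheline boundary needs no extra padding.
static_assert(paddedBytes(16, 4, 64) == 64, "exact multiple stays unchanged");

void example_padding() {
    assert(paddedBytes(21, 4, 64) / 4 == 32);  // room for 32 ints in total
}
```

In this example the padding leaves room for 11 ints beyond the 21 requested, which is what makes a full-cacheline read at the end of the array safe.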
template<typename T>
void Vc::free(T *p) [inline]
Frees memory that was allocated with Vc::malloc.
p | The pointer to the memory to be freed. |
T | The type of the allocated memory. |
Definition at line 102 of file memory.h.
Referenced by Memory< V, 0u, 0u, true >::~Memory().
void Vc::prefetchForModify(const void *addr) [inline]
Prefetch the cacheline containing addr for modification.
This prefetch evicts data from the cache. So use it only for data you really will use. When the target system supports it the cacheline will be marked as modified while prefetching, saving work later on.
addr | The cacheline containing addr will be prefetched. |
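template<typename V, typename T, typename Abi>
enable_if<(V::size() == Vector<T, Abi>::size() && sizeof(typename V::VectorEntryType) == sizeof(typename Vector<T, Abi>::VectorEntryType) && sizeof(V) == sizeof(Vector<T, Abi>) && alignof(V) <= alignof(Vector<T, Abi>)), V> Vc::reinterpret_components_cast(const Vector<T, Abi> &x) [inline]
Constructs a new Vector object of type V from the Vector x, reinterpreting the bits of x for the new type V.
This function is only applicable if:
the sizeof of the input and output types is equal
the VectorEntryTypes of input and output have equal sizeof
V | The requested type to change x into. |
x | The Vector to reinterpret as an object of type V. |
Returns an object of type V.
... Abi, though.
constexpr WhereImpl::WhereMask<M> Vc::where(const M &mask)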
Conditional assignment.
Since comparisons between SIMD vectors do not return a single boolean, but rather a vector of booleans (mask), one often cannot use if / else statements. Instead, one needs to state that only a subset of entries of a given SIMD vector should be modified. The where function can be prepended to any assignment operation to execute a masked assignment.
mask | The mask that selects the entries in the target vector that will be modified. |
The masked assignment can be written either as where(mask) | x = y or as where(mask)(x) = y.
The block following the if statement in f1 will be executed if x < 2 evaluates to true. If T is a scalar type you normally get what you expect. But if T is a SIMD vector type, the comparison will use the implicit conversion from a mask to bool, meaning all_of(x < 2).
Most of the time the required operation is a masked assignment as stated in f2.
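The difference between the two functions can be modelled without Vc. The names f1 and f2 follow the text, but their bodies below are assumptions made for this sketch, as are `Vec4`, `Mask4`, and the `f2_model` name. The model computes the mask once before either assignment; whether the original f2 does the same is not known.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for a 4-wide vector and its mask type.
using Vec4 = std::array<int, 4>;
using Mask4 = std::array<bool, 4>;

// Assumed shape of f1 for a scalar T: an ordinary branch.
void f1(int &x, int &y) {
    if (x < 2) {
        x *= y;
        y += 2;
    }
}

// Assumed shape of f2, modelled per entry: each assignment only touches the
// entries where the mask is true, the other entries keep their values.
void f2_model(Vec4 &x, Vec4 &y) {
    Mask4 mask{};
    for (std::size_t i = 0; i < x.size(); ++i) mask[i] = x[i] < 2;
    for (std::size_t i = 0; i < x.size(); ++i) if (mask[i]) x[i] *= y[i];
    for (std::size_t i = 0; i < y.size(); ++i) if (mask[i]) y[i] += 2;
}

void example_where() {
    int sx = 1;
    int sy = 5;
    f1(sx, sy);
    assert(sx == 5 && sy == 7);

    Vec4 x{{0, 1, 2, 3}};
    Vec4 y{{5, 5, 5, 5}};
    f2_model(x, y);  // mask is [true, true, false, false]
    assert((x == Vec4{{0, 5, 2, 3}}));
    assert((y == Vec4{{7, 7, 5, 5}}));
}
```

With a vector type, an if statement in the style of f1 would run its block only when every entry satisfies the condition, whereas the masked form updates exactly the entries that do.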
Definition at line 229 of file where.h.
Referenced by Vc::iif().
constexpr AlignedTag Aligned |
Use this object for a flags parameter to request aligned loads and stores.
It specifies that a load/store can expect a memory address that is aligned on the correct boundary (i.e. MemoryAlignment).
Definition at line 183 of file loadstoreflags.h.
Referenced by SimdArray< T, N, V, Wt >::reversed(), and SimdArray< T, N, V, Wt >::rotated().
constexpr UnalignedTag Unaligned |
Use this object for a flags parameter to request unaligned loads and stores.
It specifies that a load/store cannot expect a memory address that is aligned on the correct boundary (i.e. alignment is less than MemoryAlignment).
Definition at line 196 of file loadstoreflags.h.
Referenced by SimdArray< T, N, V, Wt >::reversed(), SimdArray< T, N, V, Wt >::rotated(), and MemoryBase< V, Memory< V, Size, 0u, InitPadding >, 1, void >::vector().
constexpr StreamingTag Streaming |
Use this object for a flags parameter to request streaming loads and stores.
It specifies that the cache should be bypassed for the given load/store. Whether this will actually be done depends on the target system's capabilities.
Streaming stores can be interesting when the code calculates values that, after being written to memory, will not be used for a long time or used by a different thread.
Definition at line 211 of file loadstoreflags.h.