commit a43e02cf87d0c1ddce1719d93478f0f6a3a095e8 Author: Greg Kroah-Hartman Date: Thu Feb 20 11:06:19 2014 -0800 Linux 3.10.31 commit 6843d9254c8cd2decb30edaf7deffaad4d244c51 Author: Xishi Qiu Date: Fri Feb 14 10:33:35 2014 +0800 mm: fix process accidentally killed by mce because of huge page migration Based on c8721bbbdd36382de51cd6b7a56322e0acca2414 upstream, but only the bugfix portion pulled out. Hi Naoya or Greg, We found a bug in 3.10.x. The problem is that we accidentally have a hwpoisoned hugepage in free hugepage list. It could happend in the the following scenario: process A process B migrate_huge_page put_page (old hugepage) linked to free hugepage list hugetlb_fault hugetlb_no_page alloc_huge_page dequeue_huge_page_vma dequeue_huge_page_node (steal hwpoisoned hugepage) set_page_hwpoison_huge_page dequeue_hwpoisoned_huge_page (fail to dequeue) I tested this bug, one process keeps allocating huge page, and I use sysfs interface to soft offline a huge page, then received: "MCE: Killing UCP:2717 due to hardware memory corruption fault at 8200034" Upstream kernel is free from this bug because of these two commits: f15bdfa802bfa5eb6b4b5a241b97ec9fa1204a35 mm/memory-failure.c: fix memory leak in successful soft offlining c8721bbbdd36382de51cd6b7a56322e0acca2414 mm: memory-hotplug: enable memory hotplug to handle hugepage The first one, although the problem is about memory leak, this patch moves unset_migratetype_isolate(), which is important to avoid the race. The latter is not a bug fix and it's too big, so I rewrite a small one. The following patch can fix this bug.(please apply f15bdfa802bf first) Signed-off-by: Xishi Qiu Reviewed-by: Naoya Horiguchi Signed-off-by: Greg Kroah-Hartman commit 2d9258e499abbbe55fb60a26995e378f2b51d513 Author: Jan Kara Date: Fri Oct 4 09:29:12 2013 -0400 IB/qib: Convert qib_user_sdma_pin_pages() to use get_user_pages_fast() commit 603e7729920e42b3c2f4dbfab9eef4878cb6e8fa upstream. qib_user_sdma_queue_pkts() gets called with mmap_sem held for writing. Except for get_user_pages() deep down in qib_user_sdma_pin_pages() we don't seem to need mmap_sem at all. Even more interestingly the function qib_user_sdma_queue_pkts() (and also qib_user_sdma_coalesce() called somewhat later) call copy_from_user() which can hit a page fault and we deadlock on trying to get mmap_sem when handling that fault. So just make qib_user_sdma_pin_pages() use get_user_pages_fast() and leave mmap_sem locking for mm. This deadlock has actually been observed in the wild when the node is under memory pressure. Reviewed-by: Mike Marciniszyn Signed-off-by: Jan Kara Signed-off-by: Roland Dreier [Backported to 3.10: (Thanks to Ben Huthings) - Adjust context - Adjust indentation and nr_pages argument in qib_user_sdma_pin_pages()] Signed-off-by: Ben Hutchings Signed-off-by: Mike Marciniszyn Signed-off-by: Greg Kroah-Hartman commit b0d4c0f8122abd6069312ccc20089d5c1c19c772 Author: Naoya Horiguchi Date: Wed Jul 3 15:02:37 2013 -0700 mm/memory-failure.c: fix memory leak in successful soft offlining commit f15bdfa802bfa5eb6b4b5a241b97ec9fa1204a35 upstream. After a successful page migration by soft offlining, the source page is not properly freed and it's never reusable even if we unpoison it afterward. This is caused by the race between freeing page and setting PG_hwpoison. In successful soft offlining, the source page is put (and the refcount becomes 0) by putback_lru_page() in unmap_and_move(), where it's linked to pagevec and actual freeing back to buddy is delayed. So if PG_hwpoison is set for the page before freeing, the freeing does not functions as expected (in such case freeing aborts in free_pages_prepare() check.) This patch tries to make sure to free the source page before setting PG_hwpoison on it. To avoid reallocating, the page keeps MIGRATE_ISOLATE until after setting PG_hwpoison. This patch also removes obsolete comments about "keeping elevated refcount" because what they say is not true. Unlike memory_failure(), soft_offline_page() uses no special page isolation code, and the soft-offlined pages have no elevated. Signed-off-by: Naoya Horiguchi Cc: Andi Kleen Cc: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Cc: Xishi Qiu Signed-off-by: Greg Kroah-Hartman commit 1fc42b84b45d4a58641821c3c0b601565523ad7d Author: Stanislaw Gruszka Date: Tue Feb 4 09:07:09 2014 +0100 pinctrl: protect pinctrl_list add commit 7b320cb1ed2dbd2c5f2a778197baf76fd6bf545a upstream. We have few fedora bug reports about list corruption on pinctrl, for example: https://bugzilla.redhat.com/show_bug.cgi?id=1051918 Most likely corruption happen due lack of protection of pinctrl_list when adding new nodes to it. Patch corrects that. Fixes: 42fed7ba44e ("pinctrl: move subsystem mutex to pinctrl_dev struct") Signed-off-by: Stanislaw Gruszka Acked-by: Stephen Warren Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit c2b570f3e4c62ca9fa82ef4e8e4a2bcb6ca4de8b Author: Tony Prisk Date: Thu Jan 23 21:57:33 2014 +1300 pinctrl: vt8500: Change devicetree data parsing commit f17248ed868767567298e1cdf06faf8159a81f7c upstream. Due to an assumption in the VT8500 pinctrl driver, the value passed from devicetree for 'wm,pull' was not explicitly translated before being passed to pinconf. Since v3.10, changes to 'enum pin_config_param', PIN_CONFIG_BIAS_PULL_(UP/DOWN) no longer map 1-to-1 with the expected values in devicetree. This patch adds a small translation between the devicetree values (0..2) and the enum pin_config_param equivalent values. Signed-off-by: Tony Prisk Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit f878b94f967e1c6ee0d3c220063db3ff52fe0448 Author: Peter Oberparleiter Date: Thu Feb 6 15:58:20 2014 +0100 x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y commit 6583327c4dd55acbbf2a6f25e775b28b3abf9a42 upstream. Commit d61931d89b, "x86: Add optimized popcnt variants" introduced compile flag -fcall-saved-rdi for lib/hweight.c. When combined with options -fprofile-arcs and -O2, this flag causes gcc to generate broken constructor code. As a result, a 64 bit x86 kernel compiled with CONFIG_GCOV_PROFILE_ALL=y prints message "gcov: could not create file" and runs into sproadic BUGs during boot. The gcc people indicate that these kinds of problems are endemic when using ad hoc calling conventions. It is therefore best to treat any file compiled with ad hoc calling conventions as an isolated environment and avoid things like profiling or coverage analysis, since those subsystems assume a "normal" calling conventions. This patch avoids the bug by excluding lib/hweight.o from coverage profiling. Reported-by: Meelis Roos Cc: Andrew Morton Signed-off-by: Peter Oberparleiter Link: http://lkml.kernel.org/r/52F3A30C.7050205@linux.vnet.ibm.com Signed-off-by: H. Peter Anvin Signed-off-by: Greg Kroah-Hartman commit f19148c7959cd94e70a32a92d05a19e92d9826e8 Author: Dave Jones Date: Thu Jan 30 00:17:09 2014 -0300 mxl111sf: Fix compile when CONFIG_DVB_USB_MXL111SF is unset commit 13e1b87c986100169b0695aeb26970943665eda9 upstream. Fix the following build error: drivers/media/usb/dvb-usb-v2/ mxl111sf-tuner.h:72:9: error: expected ‘;’, ‘,’ or ‘)’ before ‘struct’ struct mxl111sf_tuner_config *cfg) Signed-off-by: Dave Jones Signed-off-by: Michael Krufky Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 728311f5b50b1a6cc6b5363b4ef89dd466824fb3 Author: Antti Palosaari Date: Thu Jan 16 08:59:30 2014 -0300 af9035: add ID [2040:f900] Hauppauge WinTV-MiniStick 2 commit f2e4c5e004691dfe37d0e4b363296f28abdb9bc7 upstream. Add USB ID [2040:f900] for Hauppauge WinTV-MiniStick 2. Device is build upon IT9135 chipset. Tested-by: Stefan Becker Signed-off-by: Antti Palosaari Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 9fa5c5268125c5ba96df5c18a9eb8d7b7e5d2a4f Author: Mel Gorman Date: Tue Jan 21 14:33:21 2014 -0800 x86: mm: change tlb_flushall_shift for IvyBridge commit f98b7a772ab51b52ca4d2a14362fc0e0c8a2e0f3 upstream. There was a large performance regression that was bisected to commit 611ae8e3 ("x86/tlb: enable tlb flush range support for x86"). This patch simply changes the default balance point between a local and global flush for IvyBridge. In the interest of allowing the tests to be reproduced, this patch was tested using mmtests 0.15 with the following configurations configs/config-global-dhp__tlbflush-performance configs/config-global-dhp__scheduler-performance configs/config-global-dhp__network-performance Results are from two machines Ivybridge 4 threads: Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz Ivybridge 8 threads: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz Page fault microbenchmark showed nothing interesting. Ebizzy was configured to run multiple iterations and threads. Thread counts ranged from 1 to NR_CPUS*2. For each thread count, it ran 100 iterations and each iteration lasted 10 seconds. Ivybridge 4 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 6395.44 ( 0.00%) 6789.09 ( 6.16%) Mean 2 7012.85 ( 0.00%) 8052.16 ( 14.82%) Mean 3 6403.04 ( 0.00%) 6973.74 ( 8.91%) Mean 4 6135.32 ( 0.00%) 6582.33 ( 7.29%) Mean 5 6095.69 ( 0.00%) 6526.68 ( 7.07%) Mean 6 6114.33 ( 0.00%) 6416.64 ( 4.94%) Mean 7 6085.10 ( 0.00%) 6448.51 ( 5.97%) Mean 8 6120.62 ( 0.00%) 6462.97 ( 5.59%) Ivybridge 8 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 7336.65 ( 0.00%) 7787.02 ( 6.14%) Mean 2 8218.41 ( 0.00%) 9484.13 ( 15.40%) Mean 3 7973.62 ( 0.00%) 8922.01 ( 11.89%) Mean 4 7798.33 ( 0.00%) 8567.03 ( 9.86%) Mean 5 7158.72 ( 0.00%) 8214.23 ( 14.74%) Mean 6 6852.27 ( 0.00%) 7952.45 ( 16.06%) Mean 7 6774.65 ( 0.00%) 7536.35 ( 11.24%) Mean 8 6510.50 ( 0.00%) 6894.05 ( 5.89%) Mean 12 6182.90 ( 0.00%) 6661.29 ( 7.74%) Mean 16 6100.09 ( 0.00%) 6608.69 ( 8.34%) Ebizzy hits the worst case scenario for TLB range flushing every time and it shows for these Ivybridge CPUs at least that the default choice is a poor on. The patch addresses the problem. Next was a tlbflush microbenchmark written by Alex Shi at http://marc.info/?l=linux-kernel&m=133727348217113 . It measures access costs while the TLB is being flushed. The expectation is that if there are always full TLB flushes that the benchmark would suffer and it benefits from range flushing There are 320 iterations of the test per thread count. The number of entries is randomly selected with a min of 1 and max of 512. To ensure a reasonably even spread of entries, the full range is broken up into 8 sections and a random number selected within that section. iteration 1, random number between 0-64 iteration 2, random number between 64-128 etc This is still a very weak methodology. When you do not know what are typical ranges, random is a reasonable choice but it can be easily argued that the opimisation was for smaller ranges and an even spread is not representative of any workload that matters. To improve this, we'd need to know the probability distribution of TLB flush range sizes for a set of workloads that are considered "common", build a synthetic trace and feed that into this benchmark. Even that is not perfect because it would not account for the time between flushes but there are limits of what can be reasonably done and still be doing something useful. If a representative synthetic trace is provided then this benchmark could be revisited and the shift values retuned. Ivybridge 4 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 10.50 ( 0.00%) 10.50 ( 0.03%) Mean 2 17.59 ( 0.00%) 17.18 ( 2.34%) Mean 3 22.98 ( 0.00%) 21.74 ( 5.41%) Mean 5 47.13 ( 0.00%) 46.23 ( 1.92%) Mean 8 43.30 ( 0.00%) 42.56 ( 1.72%) Ivybridge 8 threads 3.13.0-rc7 3.13.0-rc7 vanilla altshift-v3 Mean 1 9.45 ( 0.00%) 9.36 ( 0.93%) Mean 2 9.37 ( 0.00%) 9.70 ( -3.54%) Mean 3 9.36 ( 0.00%) 9.29 ( 0.70%) Mean 5 14.49 ( 0.00%) 15.04 ( -3.75%) Mean 8 41.08 ( 0.00%) 38.73 ( 5.71%) Mean 13 32.04 ( 0.00%) 31.24 ( 2.49%) Mean 16 40.05 ( 0.00%) 39.04 ( 2.51%) For both CPUs, average access time is reduced which is good as this is the benchmark that was used to tune the shift values in the first place albeit it is now known *how* the benchmark was used. The scheduler benchmarks were somewhat inconclusive. They showed gains and losses and makes me reconsider how stable those benchmarks really are or if something else might be interfering with the test results recently. Network benchmarks were inconclusive. Almost all results were flat except for netperf-udp tests on the 4 thread machine. These results were unstable and showed large variations between reboots. It is unknown if this is a recent problems but I've noticed before that netperf-udp results tend to vary. Based on these results, changing the default for Ivybridge seems like a logical choice. Signed-off-by: Mel Gorman Tested-by: Davidlohr Bueso Reviewed-by: Alex Shi Reviewed-by: Rik van Riel Signed-off-by: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Link: http://lkml.kernel.org/n/tip-cqnadffh1tiqrshthRj3Esge@git.kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 7e405afc255142ceb5d8cc3d97a7cffac46f0e4e Author: KOSAKI Motohiro Date: Thu Feb 6 12:04:28 2014 -0800 mm: __set_page_dirty uses spin_lock_irqsave instead of spin_lock_irq commit 227d53b397a32a7614667b3ecaf1d89902fb6c12 upstream. To use spin_{un}lock_irq is dangerous if caller disabled interrupt. During aio buffer migration, we have a possibility to see the following call stack. aio_migratepage [disable interrupt] migrate_page_copy clear_page_dirty_for_io set_page_dirty __set_page_dirty_buffers __set_page_dirty spin_lock_irq This mean, current aio migration is a deadlockable. spin_lock_irqsave is a safer alternative and we should use it. Signed-off-by: KOSAKI Motohiro Reported-by: David Rientjes rientjes@google.com> Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit dce0b4fcf24d431eea39d7675c72a8b830ae6a65 Author: KOSAKI Motohiro Date: Thu Feb 6 12:04:24 2014 -0800 mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq() commit a85d9df1ea1d23682a0ed1e100e6965006595d06 upstream. During aio stress test, we observed the following lockdep warning. This mean AIO+numa_balancing is currently deadlockable. The problem is, aio_migratepage disable interrupt, but __set_page_dirty_nobuffers unintentionally enable it again. Generally, all helper function should use spin_lock_irqsave() instead of spin_lock_irq() because they don't know caller at all. other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&ctx->completion_lock)->rlock); lock(&(&ctx->completion_lock)->rlock); *** DEADLOCK *** dump_stack+0x19/0x1b print_usage_bug+0x1f7/0x208 mark_lock+0x21d/0x2a0 mark_held_locks+0xb9/0x140 trace_hardirqs_on_caller+0x105/0x1d0 trace_hardirqs_on+0xd/0x10 _raw_spin_unlock_irq+0x2c/0x50 __set_page_dirty_nobuffers+0x8c/0xf0 migrate_page_copy+0x434/0x540 aio_migratepage+0xb1/0x140 move_to_new_page+0x7d/0x230 migrate_pages+0x5e5/0x700 migrate_misplaced_page+0xbc/0xf0 do_numa_page+0x102/0x190 handle_pte_fault+0x241/0x970 handle_mm_fault+0x265/0x370 __do_page_fault+0x172/0x5a0 do_page_fault+0x1a/0x70 page_fault+0x28/0x30 Signed-off-by: KOSAKI Motohiro Cc: Larry Woodman Cc: Rik van Riel Cc: Johannes Weiner Acked-by: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 99d35c4d1e69c7963f46b996f97f5e843c48de97 Author: Takashi Iwai Date: Wed Feb 5 07:28:10 2014 +0100 ALSA: hda - Add missing mixer widget for AD1983 commit c7579fed1f1b2567529aea64ef19871337403ab3 upstream. The mixer widget on AD1983 at NID 0x0e was missing in the commit [f2f8be43c5c9: ALSA: hda - Add aamix NID to AD codecs]. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70011 Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 1e8968b31ae41b3e3480f87ef51267cfaca898b4 Author: Takashi Iwai Date: Mon Feb 3 11:02:10 2014 +0100 ALSA: hda - Fix missing VREF setup for Mac Pro 1,1 commit c20f31ec421ea4fabea5e95a6afd46c5f41e5599 upstream. Mac Pro 1,1 with ALC889A codec needs the VREF setup on NID 0x18 to VREF50, in order to make the speaker working. The same fixup was already needed for MacBook Air 1,1, so we can reuse it. Reported-by: Nicolai Beuermann Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit c3d49ed21de129e30c8f90cee43162919c4e17a2 Author: Takashi Iwai Date: Mon Feb 3 09:37:59 2014 +0100 ALSA: usb-audio: Add missing kconfig dependecy commit 4fa71c1550a857ff1dbfe9e99acc1f4cfec5f0d0 upstream. The commit 44dcbbb1cd61 introduced the usage of bitreverse helpers but forgot to add the dependency. This patch adds the selection for CONFIG_BITREVERSE. Fixes: 44dcbbb1cd61 ('ALSA: snd-usb: add support for bit-reversed byte formats') Reported-by: Fengguang Wu Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 02599bad3774a5147eb8d98df7c9362fdc1a50c6 Author: Vinayak Kale Date: Wed Feb 5 09:34:36 2014 +0000 arm64: add DSB after icache flush in __flush_icache_all() commit 5044bad43ee573d0b6d90e3ccb7a40c2c7d25eb4 upstream. Add DSB after icache flush to complete the cache maintenance operation. The function __flush_icache_all() is used only for user space mappings and an ISB is not required because of an exception return before executing user instructions. An exception return would behave like an ISB. Signed-off-by: Vinayak Kale Acked-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit fb569d15d867a06e89b1be8278404b6fbf6b5bde Author: Nathan Lynch Date: Wed Feb 5 05:53:04 2014 +0000 arm64: vdso: fix coarse clock handling commit 069b918623e1510e58dacf178905a72c3baa3ae4 upstream. When __kernel_clock_gettime is called with a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE clock id, it returns incorrectly to whatever the caller has placed in x2 ("ret x2" to return from the fast path). Fix this by saving x30/LR to x2 only in code that will call __do_get_tspec, restoring x30 afterward, and using a plain "ret" to return from the routine. Also: while the resulting tv_nsec value for CLOCK_REALTIME and CLOCK_MONOTONIC must be computed using intermediate values that are left-shifted by cs_shift (x12, set by __do_get_tspec), the results for coarse clocks should be calculated using unshifted values (xtime_coarse_nsec is in units of actual nanoseconds). The current code shifts intermediate values by x12 unconditionally, but x12 is uninitialized when servicing a coarse clock. Fix this by setting x12 to 0 once we know we are dealing with a coarse clock id. Signed-off-by: Nathan Lynch Acked-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit f35f27e775d6f8aa265fb698975c9c95f2757ef4 Author: Catalin Marinas Date: Tue Feb 4 16:01:31 2014 +0000 arm64: Invalidate the TLB when replacing pmd entries during boot commit a55f9929a9b257f84b6cc7b2397379cabd744a22 upstream. With the 64K page size configuration, __create_page_tables in head.S maps enough memory to get started but using 64K pages rather than 512M sections with a single pgd/pud/pmd entry pointing to a pte table. create_mapping() may override the pgd/pud/pmd table entry with a block (section) one if the RAM size is more than 512MB and aligned correctly. For the end of this block to be accessible, the old TLB entry must be invalidated. Reported-by: Mark Salter Tested-by: Mark Salter Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit 6737eaebff6297e5b6aeb458a4b4ad4b61df8f57 Author: Will Deacon Date: Tue Feb 4 14:41:26 2014 +0000 arm64: vdso: prevent ld from aligning PT_LOAD segments to 64k commit 40507403485fcb56b83d6ddfc954e9b08305054c upstream. Whilst the text segment for our VDSO is marked as PT_LOAD in the ELF headers, it is mapped by the kernel and not actually subject to demand-paging. ld doesn't realise this, and emits a p_align field of 64k (the maximum supported page size), which conflicts with the load address picked by the kernel on 4k systems, which will be 4k aligned. This causes GDB to fail with "Failed to read a valid object file image from memory" when attempting to load the VDSO. This patch passes the -n option to ld, which prevents it from aligning PT_LOAD segments to the maximum page size. Reported-by: Kyle McMartin Acked-by: Kyle McMartin Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit b666f382900c74f23c78556777041c2ceb7c2b24 Author: Nathan Lynch Date: Mon Feb 3 19:48:52 2014 +0000 arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE commit d4022a335271a48cce49df35d825897914fbffe3 upstream. Update wall-to-monotonic fields in the VDSO data page unconditionally. These are used to service CLOCK_MONOTONIC_COARSE, which is not guarded by use_syscall. Signed-off-by: Nathan Lynch Acked-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit 86f06cac32dc51640a6b15eeac52a644031d7617 Author: Lior Amsalem Date: Mon Nov 25 17:26:44 2013 +0100 irqchip: armada-370-xp: fix IPI race condition commit a6f089e95b1e08cdea9633d50ad20aa5d44ba64d upstream. In the Armada 370/XP driver, when we receive an IRQ 0, we read the list of doorbells that caused the interrupt from register ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS. This gives the list of IPIs that were generated. However, instead of acknowledging only the IPIs that were generated, we acknowledge *all* the IPIs, by writing ~IPI_DOORBELL_MASK in the ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS register. This creates a race condition: if a new IPI that isn't part of the ones read into the temporary "ipimask" variable is fired before we acknowledge all IPIs, then we will simply loose it. This is causing scheduling hangs on SMP intensive workloads. It is important to mention that this ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS register has the following behavior: "A CPU write of 0 clears the bits in this field. A CPU write of 1 has no effect". This is what allows us to simply write ~ipimask to acknoledge the handled IPIs. Notice that the same problem is present in the MSI implementation, but it will be fixed as a separate patch, so that this IPI fix can be pushed to older stable versions as appropriate (all the way to 3.8), while the MSI code only appeared in 3.13. Signed-off-by: Lior Amsalem Signed-off-by: Thomas Petazzoni Fixes: 344e873e5657e8dc0 'arm: mvebu: Add IPI support via doorbells' Cc: Thomas Gleixner Signed-off-by: Jason Cooper Signed-off-by: Greg Kroah-Hartman commit 4cd88080b9faf0addc76815e38195b88e0075aea Author: Harald Freudenberger Date: Wed Jan 22 13:01:33 2014 +0100 crypto: s390 - fix des and des3_ede ctr concurrency issue commit ee97dc7db4cbda33e4241c2d85b42d1835bc8a35 upstream. In s390 des and 3des ctr mode there is one preallocated page used to speed up the en/decryption. This page is not protected against concurrent usage and thus there is a potential of data corruption with multiple threads. The fix introduces locking/unlocking the ctr page and a slower fallback solution at concurrency situations. Signed-off-by: Harald Freudenberger Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 21d6cfc95cbdd193e29f55ad2fb0934ff99a9a80 Author: Harald Freudenberger Date: Wed Jan 22 13:00:04 2014 +0100 crypto: s390 - fix des and des3_ede cbc concurrency issue commit adc3fcf1552b6e406d172fd9690bbd1395053d13 upstream. In s390 des and des3_ede cbc mode the iv value is not protected against concurrency access and modifications from another running en/decrypt operation which is using the very same tfm struct instance. This fix copies the iv to the local stack before the crypto operation and stores the value back when done. Signed-off-by: Harald Freudenberger Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 5b64d8f705a69fd3a3b2859aaf253a7127150599 Author: Harald Freudenberger Date: Thu Jan 16 16:01:11 2014 +0100 crypto: s390 - fix concurrency issue in aes-ctr mode commit 0519e9ad89e5cd6e6b08398f57c6a71d9580564c upstream. The aes-ctr mode uses one preallocated page without any concurrency protection. When multiple threads run aes-ctr encryption or decryption this can lead to data corruption. The patch introduces locking for the page and a fallback solution with slower en/decryption performance in concurrency situations. Signed-off-by: Harald Freudenberger Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit a082f872743e09fe2cdf870b2d8f1b5dccd36cf7 Author: Josef Bacik Date: Wed Jan 29 16:05:30 2014 -0500 Btrfs: disable snapshot aware defrag for now commit 8101c8dbf6243ba517aab58d69bf1bc37d8b7b9c upstream. It's just broken and it's taking a lot of effort to fix it, so for now just disable it so people can defrag in peace. Thanks, Signed-off-by: Josef Bacik Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 95664c962213b9c5f9950255df8a78f6767678f2 Author: Stephen Smalley Date: Thu Jan 30 11:26:59 2014 -0500 SELinux: Fix kernel BUG on empty security contexts. commit 2172fa709ab32ca60e86179dc67d0857be8e2c98 upstream. Setting an empty security context (length=0) on a file will lead to incorrectly dereferencing the type and other fields of the security context structure, yielding a kernel BUG. As a zero-length security context is never valid, just reject all such security contexts whether coming from userspace via setxattr or coming from the filesystem upon a getxattr request by SELinux. Setting a security context value (empty or otherwise) unknown to SELinux in the first place is only possible for a root process (CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only if the corresponding SELinux mac_admin permission is also granted to the domain by policy. In Fedora policies, this is only allowed for specific domains such as livecd for setting down security contexts that are not defined in the build host policy. Reproducer: su setenforce 0 touch foo setfattr -n security.selinux foo Caveat: Relabeling or removing foo after doing the above may not be possible without booting with SELinux disabled. Any subsequent access to foo after doing the above will also trigger the BUG. BUG output from Matthew Thode: [ 473.893141] ------------[ cut here ]------------ [ 473.962110] kernel BUG at security/selinux/ss/services.c:654! [ 473.995314] invalid opcode: 0000 [#6] SMP [ 474.027196] Modules linked in: [ 474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G D I 3.13.0-grsec #1 [ 474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti: ffff8805f50cd488 [ 474.183707] RIP: 0010:[] [] context_struct_compute_av+0xce/0x308 [ 474.219954] RSP: 0018:ffff8805c0ac3c38 EFLAGS: 00010246 [ 474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX: 0000000000000100 [ 474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI: ffff8805e8aaa000 [ 474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09: 0000000000000006 [ 474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12: 0000000000000006 [ 474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15: 0000000000000000 [ 474.453816] FS: 00007f2e75220800(0000) GS:ffff88061fc00000(0000) knlGS:0000000000000000 [ 474.489254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4: 00000000000207f0 [ 474.556058] Stack: [ 474.584325] ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98 ffff8805f1190a40 [ 474.618913] ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990 ffff8805e8aac860 [ 474.653955] ffff8805c0ac3cb8 000700068113833a ffff880606c75060 ffff8805c0ac3d94 [ 474.690461] Call Trace: [ 474.723779] [] ? lookup_fast+0x1cd/0x22a [ 474.778049] [] security_compute_av+0xf4/0x20b [ 474.811398] [] avc_compute_av+0x2a/0x179 [ 474.843813] [] avc_has_perm+0x45/0xf4 [ 474.875694] [] inode_has_perm+0x2a/0x31 [ 474.907370] [] selinux_inode_getattr+0x3c/0x3e [ 474.938726] [] security_inode_getattr+0x1b/0x22 [ 474.970036] [] vfs_getattr+0x19/0x2d [ 475.000618] [] vfs_fstatat+0x54/0x91 [ 475.030402] [] vfs_lstat+0x19/0x1b [ 475.061097] [] SyS_newlstat+0x15/0x30 [ 475.094595] [] ? __audit_syscall_entry+0xa1/0xc3 [ 475.148405] [] system_call_fastpath+0x16/0x1b [ 475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48 8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7 75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8 [ 475.255884] RIP [] context_struct_compute_av+0xce/0x308 [ 475.296120] RSP [ 475.328734] ---[ end trace f076482e9d754adc ]--- Reported-by: Matthew Thode Signed-off-by: Stephen Smalley Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman